Abstract |
<P>PROBLEM TO BE SOLVED: To build an index consisting of words that are meaningful in the document by removing, when the index is created, only the blank characters used for formatting from among the blank characters contained in a structured document. <P>SOLUTION: When a text detected in the structured document by a syntax analyzing part 52 is ignorable whitespace, that is, a text consisting only of blank characters, a first blank character decision part 531 stores it as a formatting blank character in a blank character information storage part 55, in association with the structure information of the next element, namely the element whose start tag follows the text. A second blank character decision part 532 compares a blank character string in a text, together with the structure information of the structure related to that string, against the information stored in the blank character information storage part 55 to decide whether the string is a formatting blank character. An index preparation part 56 deletes the blank characters determined to be formatting blank characters from the text and prepares an index used for retrieval of the structured document containing the text. <P>COPYRIGHT: (C)2009,JPO&INPIT |
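The two decision steps described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the regex tokenizer, the use of an element path as "structure information", and the choice to test only the trailing blank run of a text that is directly followed by a start tag are all assumptions made for illustration. The input is assumed to be simple XML without comments, CDATA sections, or self-closing tags.

```python
import re

# Hypothetical tokenizer: splits simple XML into tags and text runs.
TOKEN = re.compile(r"<[^>]+>|[^<]+")


def is_end(tok):
    return tok.startswith("</")


def is_start(tok):
    return tok.startswith("<") and not tok.startswith("</")


def tag_name(tok):
    return tok.strip("</>").split()[0]


def collect_formatting_blanks(tokens):
    """First decision (assumed form): record every blank-only text that
    precedes a start tag, keyed by the path of the element it precedes."""
    store, path = set(), []
    for i, tok in enumerate(tokens):
        if is_end(tok):
            path.pop()
        elif is_start(tok):
            path.append(tag_name(tok))
        elif not tok.strip() and i + 1 < len(tokens) and is_start(tokens[i + 1]):
            next_path = "/".join(path + [tag_name(tokens[i + 1])])
            store.add((next_path, tok))
    return store


def clean_texts(tokens, store):
    """Second decision (assumed form): drop the trailing blank run of a text
    only if the same (next element path, blank string) pair was recorded.
    Blank-only texts are not indexed at all."""
    cleaned, path = [], []
    for i, tok in enumerate(tokens):
        if is_end(tok):
            path.pop()
        elif is_start(tok):
            path.append(tag_name(tok))
        elif tok.strip():
            text = tok
            if i + 1 < len(tokens) and is_start(tokens[i + 1]):
                trailing = text[len(text.rstrip()):]
                next_path = "/".join(path + [tag_name(tokens[i + 1])])
                if trailing and (next_path, trailing) in store:
                    text = text[: len(text) - len(trailing)]
            cleaned.append(("/".join(path), text))
    return cleaned


xml = "<list>\n  <item>one</item>\n  two\n  <item>a <b>b</b></item>\n</list>"
tokens = TOKEN.findall(xml)
store = collect_formatting_blanks(tokens)
texts = clean_texts(tokens, store)
```

In this sample, the newline plus two spaces before the second `<item>` matches the indentation recorded for `list/item` and is removed, while the single space in `a ` before `<b>` has no recorded counterpart and is kept as meaningful text. An index builder would then tokenize the entries in `texts` instead of the raw text nodes.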