Abstract |
PROBLEM TO BE SOLVED: To provide a method and a device for high-speed full-text search that take the entire text of each document as the retrieval target and return results within a practically acceptable retrieval time. SOLUTION: When a document is registered in the document database, the character strings in its text are divided by character type, such as kana and kanji. The inclusion relations among the resulting partial character strings are examined, and a condensed text is prepared as the set of partial character strings that remain after excluding every string contained in another. A character component table is then prepared that lists, without duplication, the characters appearing in the condensed text. The condensed text and the character component table are registered in the document database together with the text of the document. At retrieval time, the character component table is consulted first to extract the documents that contain the characters of a designated keyword. The condensed texts of those documents are consulted next, and only the documents whose condensed text contains the partial character strings of the designated keyword are kept. Finally, the texts of the remaining documents are consulted, and only the texts that satisfy the retrieval conditions specified between keywords are extracted.
|
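The registration and three-stage retrieval described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. All names here (`char_type`, `split_by_type`, `condense`, `DocumentIndex`) are hypothetical. The character classes are approximated by Unicode ranges, every other character is treated as a delimiter, stage 1 is assumed to require all characters of the keyword, and the "retrieval conditions between keywords" are assumed to be a plain AND of literal occurrences in the full text.

```python
from itertools import groupby


def char_type(ch):
    """Rough character class by Unicode range (an assumption of this sketch)."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"
    if ch.isascii() and ch.isalpha():
        return "latin"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return None  # anything else acts as a delimiter


def split_by_type(text):
    """Divide text into maximal runs of characters of the same type."""
    return ["".join(run) for kind, run in groupby(text, key=char_type)
            if kind is not None]


def condense(parts):
    """Drop duplicates and every string contained in another string."""
    kept = []
    # Longest first, so any string that could contain `s` is checked before `s`.
    for s in sorted(set(parts), key=lambda p: (-len(p), p)):
        if not any(s in longer for longer in kept):
            kept.append(s)
    return kept


class DocumentIndex:
    def __init__(self):
        self.docs = {}  # doc_id -> (text, condensed text, character table)

    def register(self, doc_id, text):
        condensed = condense(split_by_type(text))
        char_table = set("".join(condensed))  # characters without duplication
        self.docs[doc_id] = (text, condensed, char_table)

    def search(self, keywords):
        kw_parts = [p for kw in keywords for p in split_by_type(kw)]
        hits = []
        for doc_id, (text, condensed, char_table) in self.docs.items():
            # Stage 1: every keyword character must appear in the table.
            if not all(ch in char_table for p in kw_parts for ch in p):
                continue
            # Stage 2: every keyword part must occur inside a condensed string.
            if not all(any(p in c for c in condensed) for p in kw_parts):
                continue
            # Stage 3: check the full text (assumed condition: AND of keywords).
            if all(kw in text for kw in keywords):
                hits.append(doc_id)
        return hits


index = DocumentIndex()
index.register("a", "全文検索の高速化")
index.register("b", "検索エンジンと文書")
print(index.search(["検索"]))
```

Each stage is cheaper than the next, so the character table and the condensed text serve as progressively finer filters before the full text is scanned. For example, the keyword `文検` passes stage 1 for document `b`, because both characters occur somewhere in it, but is rejected at stage 2 because no condensed string contains the two characters adjacently.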