Abstract |
PROBLEM TO BE SOLVED: Conventional methods reduce index size by preparing an index for each divided document unit. As a result, the appearance position information of an index word does not correspond to an appearance position within the divided document, and retrieval over each divided document is inefficient. In addition, because the division unit is based on the same criterion for all retrieval words, the retrieval size for a high-frequency retrieval word may increase enormously as the document volume grows. SOLUTION: An index word extraction part extracts, from a document, each index word and its appearance position in the document. An index type acquisition part acquires the index type of the extracted index word, and an index type determination part uses it to determine the index type of the index to be created. An index preparation part prepares, on the basis of the determined index type, an index that includes the appearance position list of the index word in the document, the index type, and the index size, and stores the prepared index in an index storage part. When an index for the extracted index word is already stored in the index storage part, the index type acquisition part acquires its index type and index size from the index storage part, and the index type determination part determines the index type of the index to be prepared from that index type and index size so that the index size stays within an upper limit value. COPYRIGHT: (C)2008,JPO&INPIT
|
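The flow described in the SOLUTION can be illustrated with a minimal Python sketch. It is not the patented implementation, and the abstract does not name the index types or say how size is measured. The type names `"single"` and `"segmented"`, the whitespace tokenizer, the dictionary used as the index storage part, and the choice to measure index size as the number of positions in the current segment are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    # Hypothetical type names; the abstract does not enumerate the index types.
    index_type: str = "single"
    # Appearance position lists, one list per segment: [(doc_id, position), ...]
    segments: list = field(default_factory=lambda: [[]])

    @property
    def size(self) -> int:
        # Assumption: "index size" is the entry count of the current segment.
        return len(self.segments[-1])


def extract_index_words(doc_id, text):
    """Index word extraction part: yield (word, (doc_id, position)) pairs."""
    for position, word in enumerate(text.split()):
        yield word.lower(), (doc_id, position)


def determine_index_type(existing, upper_limit):
    """Index type determination part: choose a type that keeps size within the limit."""
    if existing is None:
        return "single"
    if existing.size >= upper_limit:
        return "segmented"
    return existing.index_type


def add_document(store, doc_id, text, upper_limit=4):
    """Index preparation part: update the store (the index storage part)."""
    for word, occurrence in extract_index_words(doc_id, text):
        existing = store.get(word)  # index type acquisition part
        index_type = determine_index_type(existing, upper_limit)
        if existing is None:
            existing = store[word] = Index(index_type=index_type)
        elif existing.size >= upper_limit:
            # The current segment is full: switch type and open a new segment.
            existing.index_type = index_type
            existing.segments.append([])
        existing.segments[-1].append(occurrence)
```

In this sketch a low-frequency word keeps a single position list, while a high-frequency word is split into segments only once its own list reaches the limit, so the division criterion depends on each word rather than being the same for all words.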