Abstract |
<P>PROBLEM TO BE SOLVED: To obtain a predicted document count that deviates little from server to server in an environment where inverted indexes of the words contained in documents are distributed across a plurality of servers. <P>SOLUTION: A document number prediction method comprises: a step of dividing an input query into individual words if the query is a compound word; a step of extracting the inverted index corresponding to each of the divided words by referring to inverted index storage means; a step of calculating, based on the extracted inverted indexes, the number of all documents including the compound word (a first document number); a step of dividing the extracted inverted indexes into a plurality of blocks and calculating, for each block, the number of documents including the compound word (a second document number); a step of calculating a distribution correction value that indicates how unevenly the documents including the compound word are distributed across all of the blocks; and a step of calculating, by using the distribution correction value, a predicted value of the number of documents including the compound word among all documents. <P>COPYRIGHT: (C)2012,JPO&INPIT |
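
The steps listed under SOLUTION can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method claimed in the patent: the abstract gives no formulas, so the whitespace splitter, the equal-width blocks over document IDs, the coefficient-of-variation correction, and the names `split_compound`, `predict_document_count`, `local_docs`, and `total_docs` are all hypothetical. Documents containing every constituent word are treated here as documents including the compound word, which is itself a simplification.

```python
from statistics import mean, pstdev


def split_compound(query):
    # Hypothetical splitter: a real system would likely use morphological
    # analysis; here the compound word is simply split on whitespace.
    return query.split()


def predict_document_count(query, index, local_docs, total_docs, num_blocks=4):
    """Sketch of the abstract's steps for one server.

    index      : dict mapping a word to a list of local document IDs
                 (IDs assumed to lie in range(local_docs))
    local_docs : number of documents indexed on this server
    total_docs : number of documents across all servers
    """
    # Steps 1 and 2: split the query and fetch one posting list per word.
    words = split_compound(query)
    postings = [set(index.get(word, ())) for word in words]
    matches = set.intersection(*postings) if postings else set()

    # Step 3: first document number (all local documents that match).
    first_count = len(matches)

    # Step 4: second document number, one count per equal-width ID block.
    block_size = max(1, -(-local_docs // num_blocks))  # ceiling division
    block_counts = [0] * num_blocks
    for doc_id in matches:
        block_counts[min(doc_id // block_size, num_blocks - 1)] += 1

    # Step 5: distribution correction value. ASSUMPTION: derived from the
    # coefficient of variation of the block counts; equals 1.0 when the
    # matches are spread evenly and shrinks as they become more skewed.
    avg = mean(block_counts)
    cv = pstdev(block_counts) / avg if avg else 0.0
    correction = 1.0 / (1.0 + cv)

    # Step 6: predicted count over all documents. ASSUMPTION: the local
    # count is scaled up linearly and damped by the correction value.
    predicted = first_count * (total_docs / local_docs) * correction

    return {
        "first_count": first_count,
        "block_counts": block_counts,
        "correction": correction,
        "predicted": predicted,
    }
```

Calling `predict_document_count("search engine", index, local_docs=8, total_docs=80)` on a server whose matches are spread evenly over the blocks returns the plain linear extrapolation, while a server whose matches cluster in one block returns a smaller, damped estimate. Whether damping is the right direction for the correction is a design choice of this sketch, not something the abstract states.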