Abstract |
The present invention is a method and apparatus for retrieving information from a database. Initially, the documents within the database are divided into mutually exclusive subdocuments that generally correspond to paragraphs of text. The present invention further creates a second set of subdocuments that overlap adjacent paragraphs of text. In particular, the location of each overlapping subdocument depends on the size of the initial paragraphs. This second set of overlapping subdocuments is scored in the same way as the mutually exclusive subdocuments. The scores from both the mutually exclusive and the overlapping subdocuments are used to rank the relevance of documents to a query. Using both sets of subdocument scores improves the effectiveness of the scoring algorithm.
|
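
The two-pass subdocument scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the overlapping windows are placed, how a subdocument is scored, or how subdocument scores are combined, so everything concrete here is an assumption made for illustration: the midpoint-to-midpoint window rule, the term-count `score` function, taking the maximum subdocument score as the document score, and all function names are hypothetical.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase word tokens; a deliberately simple, hypothetical tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_paragraphs(document: str) -> List[List[str]]:
    """Mutually exclusive subdocuments: one token list per blank-line-separated paragraph."""
    paragraphs = [tokenize(p) for p in re.split(r"\n\s*\n", document)]
    return [p for p in paragraphs if p]


def overlapping_subdocuments(paragraphs: List[List[str]]) -> List[List[str]]:
    """Second set of subdocuments, one per pair of adjacent paragraphs.

    Assumption for this sketch: each window runs from the midpoint of one
    paragraph to the midpoint of the next, so its location depends on the
    sizes of the initial paragraphs.
    """
    windows = []
    for left, right in zip(paragraphs, paragraphs[1:]):
        windows.append(left[len(left) // 2:] + right[:len(right) // 2])
    return windows


def score(subdocument: List[str], query_terms: List[str]) -> int:
    """Hypothetical score: number of tokens that match any query term."""
    terms = set(query_terms)
    return sum(1 for token in subdocument if token in terms)


def document_score(document: str, query: str) -> int:
    """Score both sets identically and keep the best subdocument score (an assumption)."""
    query_terms = tokenize(query)
    paragraphs = split_paragraphs(document)
    candidates = paragraphs + overlapping_subdocuments(paragraphs)
    return max((score(s, query_terms) for s in candidates), default=0)


def rank(documents: List[str], query: str) -> List[int]:
    """Return document indices ordered from most to least relevant."""
    scores = [document_score(d, query) for d in documents]
    return sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
```

In this sketch, a query whose terms straddle a paragraph break (for example, one term at the end of a paragraph and the other at the start of the next) is matched in full only by the overlapping window, which is the kind of case where scoring both sets could change the ranking.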