Title of invention Overlapping subdocuments in a vector space search process
Abstract The present invention is a method and apparatus for retrieving information from a database. Initially, the documents within the database are divided into mutually exclusive subdocuments that generally correspond to paragraphs of text. The present invention further creates a second set of subdocuments that overlap adjacent paragraphs of text. In particular, the location of the overlapping subdocuments depends on the size of the initial paragraphs. This second set of overlapping subdocuments is scored just as the mutually exclusive subdocuments are scored. The scores from both the mutually exclusive and the overlapping subdocuments are used in ranking the relevance of documents to a query. The use of both sets of subdocument scores improves the effectiveness of the scoring algorithm.
Publication number US6205443(B1) Publication date 2001.03.20
Application number US19990225115 Filing date 1999.01.04
Applicant CLARITECH CORPORATION Inventor EVANS DAVID A.
Classification G06F17/30;(IPC1-7):G06F12/30 Main classification G06F17/30
Agency Agent
Main claim
Address
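The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not state how the overlapping subdocuments are positioned, how subdocuments are scored, or how the two sets of scores are combined. The sketch therefore assumes that each overlapping subdocument runs from the midpoint of one paragraph to the midpoint of the next, that scoring is cosine similarity over raw term counts, and that a document's score is the maximum over all of its subdocuments. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_paragraphs(document):
    """Mutually exclusive subdocuments: one token list per blank-line-separated paragraph."""
    parts = re.split(r"\n\s*\n", document)
    return [tokens for tokens in map(tokenize, parts) if tokens]


def overlapping_subdocuments(paragraphs):
    """Overlapping subdocuments spanning each pair of adjacent paragraphs.

    Assumption: each one runs from the midpoint of the left paragraph to the
    midpoint of the right paragraph, so its location depends on paragraph size.
    """
    overlaps = []
    for left, right in zip(paragraphs, paragraphs[1:]):
        overlaps.append(left[len(left) // 2:] + right[:len(right) // 2])
    return [tokens for tokens in overlaps if tokens]


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two raw term-count vectors."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_document(document, query):
    """Score both sets of subdocuments the same way; keep the best score (assumed rule)."""
    query_tokens = tokenize(query)
    paragraphs = split_paragraphs(document)
    subdocuments = paragraphs + overlapping_subdocuments(paragraphs)
    return max((cosine(s, query_tokens) for s in subdocuments), default=0.0)


def rank(documents, query):
    """Return document names ordered by descending score, ties broken by name."""
    scored = [(score_document(text, query), name) for name, text in documents.items()]
    return [name for _, name in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

Under these assumptions, a query whose terms straddle a paragraph boundary matches the overlapping subdocument more strongly than either paragraph alone, which is one way that using both sets of scores could improve ranking.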