Title of invention Information retrieval system for archiving multiple document versions
Abstract An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. Index data for multiple versions or instances of documents is also maintained. Each document instance is associated with a date range and relevance data derived from the document for the date range.
Publication number US9384224(B2) Publication date 2016.07.05
Application number US201314082501 Filing date 2013.11.18
Applicant Google Inc. Inventor Patterson Anna L
Classification G06F7/00;G06F17/30;G06F17/22 Main classification G06F7/00
Agency Agent
Main claim 1. A system comprising: a phrase-based index that includes a plurality of posting lists, each of the posting lists including: a phrase, and a list of documents associated with the phrase; at least one processor; and memory storing instructions that, when executed by the at least one processor, cause the system to perform operations including: receiving, on a first date, a current version of a previously indexed version of a document, the previously indexed version being associated with a closed date of a date range that represents current validity, determining that the current version differs from the previously indexed version, generating a document identifier for the current version that is based on the first date, determining at least one phrase associated with the current version, updating the closed date of the date range for the previously indexed version based on the first date, setting a closed date of a date range for the current version to a status that represents current validity, and updating a posting list for the at least one phrase to include the document identifier.
Address Mountain View CA US
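
The operations recited in the main claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `PhraseIndex`, `Version`, `receive` and `extract_phrases` are hypothetical, the word-bigram phrase extractor is a toy stand-in for the phrase identification described in the abstract, the `url@date` identifier format is an assumption, and `None` is assumed as the closed-date status that represents current validity. Handling of a document with no previously indexed version is also an addition for the sake of a runnable example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Set


@dataclass
class Version:
    doc_id: str                    # identifier derived from the receipt date
    content: str
    start: date                    # first date on which this version was seen
    closed: Optional[date] = None  # None is the assumed "currently valid" status


@dataclass
class PhraseIndex:
    # phrase -> list of document identifiers (one posting list per phrase)
    postings: Dict[str, List[str]] = field(default_factory=dict)
    # url -> chronological list of indexed versions
    versions: Dict[str, List[Version]] = field(default_factory=dict)

    @staticmethod
    def extract_phrases(text: str) -> Set[str]:
        # Toy stand-in: treat every adjacent word pair as a phrase.
        words = text.lower().split()
        return {" ".join(words[i:i + 2]) for i in range(len(words) - 1)}

    def receive(self, url: str, content: str, received_on: date) -> Optional[Version]:
        history = self.versions.setdefault(url, [])
        previous = history[-1] if history else None

        # Determine that the current version differs from the indexed one.
        if previous is not None and previous.content == content:
            return None

        # Generate a document identifier based on the receipt date.
        doc_id = f"{url}@{received_on.isoformat()}"

        # Close the date range of the previously indexed version.
        if previous is not None:
            previous.closed = received_on

        # The new version gets an open closed date (currently valid).
        current = Version(doc_id, content, received_on)
        history.append(current)

        # Update the posting list of each phrase found in the new version.
        for phrase in sorted(self.extract_phrases(content)):
            self.postings.setdefault(phrase, []).append(doc_id)
        return current
```

In this sketch, calling `receive` twice for the same URL with different content leaves the earlier `Version` with a `closed` date equal to the second receipt date, while the later one stays open. Posting lists for phrases shared by both versions then hold both identifiers.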