Title of invention SYSTEM AND METHOD FOR PORTABLE DOCUMENT INDEXING USING N-GRAM WORD DECOMPOSITION
Abstract <p>A system and method provide for indexing and retrieval of stored documents by decomposing the words in the documents into n-grams, or linear word subunits. The documents are indexed as pages in a number of banks, and each bank has a bank index. The individual n-grams are identified for each page and stored in the bank index. Each bank index further contains an entry map that indicates whether a given n-gram is present in any of the pages of the bank and provides an index to a page map, which in turn indicates which page in the bank contains the n-gram. When a search query is input, the query words are decomposed into their n-grams. The query word n-grams are first compared with the entry maps to determine whether they appear on any page in the bank. If so, the associated page map is traversed to determine which page in the bank contains the query word n-grams. The n-grams on that page are then compared with the query word n-grams to determine whether they match. Matching pages are flagged. When all pages in all banks have been processed, the flagged pages are consolidated with respect to the documents to which they belong, resulting in a list of documents that match the search query. The results are displayed to a user.</p>
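The lookup flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the n-gram length of 3, the dictionary-based entry map, the integer-bitmask page maps, and all names (`ngrams`, `BankIndex`, `search`) are assumptions made for illustration only.

```python
import re

N = 3  # assumed n-gram length; the abstract does not fix a value


def ngrams(word, n=N):
    """Return the set of length-n character subunits of a word."""
    w = word.lower()
    if len(w) <= n:
        return {w}
    return {w[i:i + n] for i in range(len(w) - n + 1)}


def text_ngrams(text):
    """Decompose every word of a text into n-grams."""
    grams = set()
    for word in re.findall(r"\w+", text):
        grams |= ngrams(word)
    return grams


class BankIndex:
    """Hypothetical bank index: an entry map pointing into page maps."""

    def __init__(self):
        self.page_doc = []     # page number -> document id
        self.page_ngrams = []  # page number -> n-grams found on that page
        self.entry_map = {}    # n-gram -> slot in self.page_maps
        self.page_maps = []    # one bitmask per n-gram; bit p set = page p

    def add_page(self, doc_id, text):
        page_no = len(self.page_doc)
        grams = text_ngrams(text)
        self.page_doc.append(doc_id)
        self.page_ngrams.append(grams)
        for g in grams:
            slot = self.entry_map.setdefault(g, len(self.page_maps))
            if slot == len(self.page_maps):  # first time this n-gram is seen
                self.page_maps.append(0)
            self.page_maps[slot] |= 1 << page_no

    def matching_pages(self, query):
        wanted = text_ngrams(query)
        if not wanted:
            return []
        candidates = (1 << len(self.page_doc)) - 1  # start with all pages
        for g in wanted:
            slot = self.entry_map.get(g)
            if slot is None:
                return []  # entry map: n-gram appears nowhere in this bank
            candidates &= self.page_maps[slot]  # page map narrows the pages
        # Final per-page comparison of the page n-grams with the query n-grams.
        return [p for p in range(len(self.page_doc))
                if (candidates >> p) & 1 and wanted <= self.page_ngrams[p]]


def search(banks, query):
    """Flag matching pages in every bank, then consolidate by document."""
    docs = []
    for bank in banks:
        for p in bank.matching_pages(query):
            doc = bank.page_doc[p]
            if doc not in docs:
                docs.append(doc)
    return docs
```

In this sketch a page matches when it contains every n-gram of every query word, so a match is approximate: a page can hold all the n-grams of a word without holding the word itself. The entry map lets a whole bank be skipped as soon as one query n-gram is absent from it.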
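Publication number MX9606259(A) Publication date 1998.02.28
Application number MX19960006259 Filing date 1996.04.10
Applicant REBUS TECHNOLOGY, INC. Inventors VIJAYAKUMAR RANGARAJAN; NATARAJAN RAVICHANDRAN
Classification (IPC1-7):G06F17/27 Main classification (IPC1-7):G06F17/27
Agency Agent
Principal claim
Address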