Title DOCUMENT RETRIEVAL SYSTEM AND SEARCH METHOD USING WORD SET AND CHARACTER LOOK-UP TABLES
Abstract A computer-operated document retrieval system includes a lexicon of words contained in system documents, and a document look-up table that relates words, by unique word numbers, to the documents. A word look-up table identifies sets of words with common characteristics, specifically prefix value and word length, and a character look-up table identifies whether any word contains a specified character. A target set generator accesses the word look-up table to compose a target word set with characteristics corresponding to the search string. A refining module reduces the target set by selecting a set of characters from the search string and accessing the character look-up table to identify which target words use that character set. The character look-up table is a Boolean array of one-bit elements that are processed in groups whose size corresponds to the maximum bit processing count of the computer, effectively culling non-matching words simultaneously. A string comparison module determines whether any word remaining in the target set matches the search string. The system quickly executes various searches, including prefix, exact match, wildcard, and fuzzy searches.
Publication number CA2340531(C) Publication date 2006.10.10
Application number CA20012340531 Filing date 2001.03.12
Applicant IBM CANADA LIMITED - IBM CANADA LIMITEE Inventor GREEN, ROBIN A. R.
Classification G06F17/30;G06F17/20 Main classification G06F17/30
Agency Agent
Main claim
Address