Title of invention METHOD AND APPARATUS FOR AUTOMATED SEARCH AND RETRIEVAL PROCESSING
Abstract This invention provides a method and apparatus for automated search and retrieval processing that includes a tokenizer, a noun phrase analyzer, and a morphological analyzer. The tokenizer includes a parser that extracts characters from the stream of text, an identifying element for identifying a token formed of characters in the stream of text that include lexical matter, and a filter for assigning tags to those tokens requiring further linguistic analysis. The tokenizer, in a single pass through the stream of text, determines the further linguistic processing suitable to each particular token contained in the stream of text. The noun phrase analyzer annotates tokens with tags identifying characteristics of the tokens and contextually analyzes each token. During processing, the noun phrase analyzer can also disambiguate individual token characteristics and identify agreement between tokens. The morphological analyzer organizes, utilizes, analyzes, and generates morphological data related to the tokens. In particular, the morphological analyzer locates a stored lexical expression representative of a candidate token found in a stream of natural language text, identifies a paradigm for the candidate token based upon the stored lexical expression, and applies transforms contained within the identified paradigm to the candidate token.
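The morphological step described in the abstract (locate a stored lexical expression, identify its paradigm, apply the paradigm's transforms) can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Transform`, `PARADIGMS`, `LEXICON`, and `analyze`, the suffix-replacement form of a transform, and the two sample paradigms are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transform:
    """Hypothetical transform: strip one suffix, append another."""
    strip: str  # suffix removed from the stored base form
    add: str    # suffix appended after stripping
    tag: str    # label for the generated form

    def apply(self, base: str) -> str:
        if self.strip and not base.endswith(self.strip):
            raise ValueError(f"{base!r} does not end with {self.strip!r}")
        stem = base[: len(base) - len(self.strip)]
        return stem + self.add


# Assumed paradigms: each is a list of transforms shared by many words.
PARADIGMS = {
    "noun_s": [Transform("", "", "singular"), Transform("", "s", "plural")],
    "noun_y": [Transform("", "", "singular"), Transform("y", "ies", "plural")],
}

# Assumed lexicon: stored lexical expression -> paradigm identifier.
LEXICON = {"token": "noun_s", "query": "noun_y"}


def analyze(candidate: str) -> dict:
    """Look up the candidate, find its paradigm, apply every transform."""
    base = candidate.lower()
    paradigm_id = LEXICON.get(base)
    if paradigm_id is None:
        return {}  # no stored lexical expression for this candidate
    return {t.tag: t.apply(base) for t in PARADIGMS[paradigm_id]}
```

In this sketch, `analyze("query")` yields the singular form `query` and the plural form `queries`, while a candidate with no lexicon entry yields an empty result. Sharing one paradigm across many lexicon entries keeps the stored data small, which is one plausible reason for a paradigm-based design.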
Publication number WO9704405(A1) Publication date 1997.02.06
Application number WO1996US12018 Filing date 1996.07.19
Applicant INSO CORPORATION Inventors CARUS, ALWIN, B.;WIESNER, MICHAEL;HAQUE, ATEEQUE, R.;BOONE, KEITH
Classification G06F17/27;G06F17/28;G06F17/30;(IPC1-7):G06F17/28 Main classification G06F17/27
Agency Agent
Principal claim
Address