Title: Information retrieval utilizing semantic representation of text by identifying hypernyms and indexing multiple tokenized semantic structures to a same passage of text
Abstract: The present invention is directed to performing information retrieval utilizing semantic representation of text. In a preferred embodiment, a tokenizer generates from an input string information retrieval tokens that characterize the semantic relationship expressed in the input string. The tokenizer first creates from the input string a primary logical form characterizing a semantic relationship between selected words in the input string. The tokenizer then identifies hypernyms that each have an "is a" relationship with one of the selected words in the input string. The tokenizer then constructs from the primary logical form one or more alternative logical forms. The tokenizer constructs each alternative logical form by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with an identified hypernym of the selected word. Finally, the tokenizer generates tokens representing both the primary logical form and the alternative logical forms. The tokenizer is preferably used to generate tokens for both constructing an index representing target documents and processing a query against that index.
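The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes the primary logical form has already been parsed into a (subject, verb, object) triple, and the `HYPERNYMS` table, the function names, the underscore-joined token format, and the sample words are all hypothetical choices made for illustration.

```python
from itertools import product

# Hypothetical "is a" table; a real system would consult a lexical
# knowledge base. These entries are illustrative assumptions only.
HYPERNYMS = {
    "man": ["person"],
    "kiss": ["touch"],
    "pig": ["swine", "animal"],
}

def logical_forms(primary):
    """Return the primary form first, then every alternative form obtained
    by replacing one or more words with one of their hypernyms."""
    choices = [[word] + HYPERNYMS.get(word, []) for word in primary]
    return list(product(*choices))

def tokenize(primary):
    """Emit one token per logical form (assumed format: words joined by '_')."""
    return ["_".join(form) for form in logical_forms(primary)]

def build_index(passages):
    """Map every token of every passage to the IDs of passages producing it,
    so multiple tokenized forms point at the same passage."""
    index = {}
    for passage_id, primary in passages.items():
        for token in tokenize(primary):
            index.setdefault(token, set()).add(passage_id)
    return index

def search(index, query_primary):
    """Tokenize the query the same way and collect matching passage IDs."""
    hits = set()
    for token in tokenize(query_primary):
        hits |= index.get(token, set())
    return hits

index = build_index({"p1": ("man", "kiss", "pig")})
matches = search(index, ("person", "touch", "animal"))
```

In this sketch a query phrased with the more general words still reaches passage `p1`, because the hypernym-substituted forms of the passage were indexed alongside its primary form.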
Publication number: US6161084(A) Publication date: 2000.12.12
Application number: US19990366499 Filing date: 1999.08.03
Applicant: MICROSOFT CORPORATION Inventors: MESSERLY, JOHN J.;HEIDORN, GEORGE E.;RICHARDSON, STEPHEN D.;DOLAN, WILLIAM B.;JENSEN, KAREN
Classification: G06F17/27;G06F17/30;(IPC1-7):G06F17/27 Main classification: G06F17/27
Agency: Agent:
Principal claim:
Address: