Title of Invention Systems and methods for structural indexing of natural language text
Abstract A structural natural language index is created by segmenting documents within a repository into text portions and extracting derived features covering named entities, co-reference, lexical entries, structural-semantic relationships, speaker attribution, and meronymy. For each text portion, a constituent structure is determined that contains the constituent elements and ordering information sufficient to reconstruct the text portion. A functional structure of the text portion is also determined. A set of characterizing predicative triples is formed from the functional structure by applying linearization transfer rules. The constituent structure, the characterizing predicative triples, and the derived features are combined to form a canonical form of the text portion. Each canonical form is added to the structural natural language index. A retrieved question is classified to determine its question type, and a corresponding canonical form is generated for the question. The structural natural language index is searched for entries that match the canonical form of the question and are relevant to the question type. The characterizing predicative triples of a matching entry are used in conjunction with a generation grammar to create an answer. If generation fails, some or all of the constituent structure of the matching entry is returned as the answer.
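The index-then-answer flow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in, not the method claimed in the application: `toy_triples` replaces functional-structure analysis and linearization transfer rules with a naive subject-verb-object reading of three-word sentences, the derived features are reduced to capitalized tokens, question classification only recognizes "who", and `GENERATABLE_VERBS` plays the role of a deliberately tiny generation grammar so that the fallback path can be exercised.

```python
from dataclasses import dataclass

WILDCARD = "?"                   # marks the unknown slot in a question (assumption)
GENERATABLE_VERBS = {"hired"}    # toy "generation grammar" vocabulary (assumption)


@dataclass(frozen=True)
class CanonicalForm:
    constituents: tuple    # ordered tokens, enough to reconstruct the text portion
    triples: frozenset     # (predicate, role, value) characterizing triples
    features: frozenset    # derived features; here only capitalized tokens


def toy_triples(tokens):
    """Hypothetical stand-in for functional analysis plus transfer rules:
    reads a three-token portion as subject, verb, object."""
    if len(tokens) != 3:
        return frozenset()
    subj, verb, obj = tokens
    return frozenset({(verb, "subj", subj), (verb, "obj", obj)})


def canonicalize(text):
    tokens = tuple(text.rstrip(".?").split())
    features = frozenset(t for t in tokens if t[0].isupper())
    return CanonicalForm(tokens, toy_triples(tokens), features)


def build_index(documents):
    """Segment each document into text portions and index their canonical forms."""
    index = []
    for doc in documents:
        for portion in doc.split(". "):
            if portion.strip():
                index.append(canonicalize(portion))
    return index


def matches(q_triples, entry):
    """Every question triple must be matched by an entry triple; a wildcard
    value matches any value."""
    return bool(q_triples) and all(
        any(p == ep and r == er and v in (WILDCARD, ev)
            for ep, er, ev in entry.triples)
        for p, r, v in q_triples
    )


def generate(triples):
    """Toy generation step: returns None when the verb is outside the grammar."""
    by_role = {role: (pred, val) for pred, role, val in triples}
    if "subj" not in by_role or "obj" not in by_role:
        return None
    verb = by_role["subj"][0]
    if verb not in GENERATABLE_VERBS:
        return None
    return f"{by_role['subj'][1]} {verb} {by_role['obj'][1]}."


def answer(index, question):
    tokens = question.rstrip("?").split()
    qtype = tokens[0].lower() if tokens else ""
    if qtype != "who":           # toy question classification: only "who" is handled
        return None
    q_triples = toy_triples((WILDCARD,) + tuple(tokens[1:]))
    for entry in index:
        if matches(q_triples, entry):
            generated = generate(entry.triples)
            # Fall back to the stored constituent structure if generation fails.
            return generated if generated is not None else " ".join(entry.constituents)
    return None


index = build_index(["Alice founded Acme. Bob hired Carol."])
print(answer(index, "Who hired Carol?"))    # answered through the toy generation step
print(answer(index, "Who founded Acme?"))   # generation fails, constituents returned
```

Storing the ordered constituents alongside the triples is what makes the fallback possible: when the generation step cannot produce an answer, the original wording of the matching text portion can still be reconstructed and returned.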
Publication No. US2007073533(A1) Publication Date 2007.03.29
Application No. US20060405385 Filing Date 2006.04.17
Applicant FUJI XEROX CO., LTD. Inventors THIONE GIOVANNI L.; VAN DEN BERG MARTIN H.
Classification G06F17/27 Main Classification G06F17/27
Agency Agent
Independent Claim
Address