Title of invention Method and system for enhanced data searching by parsing data into syntactic units
Abstract A syntactic query engine for transforming at least one sentence of a document or query into a canonical representation using entity tags comprises a memory medium containing a parser that is configured to: receive a designation of a plurality of entity tags; decompose the at least one sentence to generate a parse structure for the sentence having a plurality of syntactic elements that correspond to a part of speech; determine from the parse structure a set of meaningful terms that correspond to one or more of the designated entity tags; and, for each of one or more of the meaningful terms, store a representation in an enhanced data representation data structure. The representation includes the term and the corresponding entity tag type, such that the at least one sentence is represented in the data structure by at least one entity tag. Each entity tag has a type and a value, and the type of each entity tag indicates a possible attribute of a sentence that does not represent a part of speech and does not represent a grammatical role. A query engine for searching a corpus of documents, containing a parser and a postprocessor, is also disclosed. Each document has a plurality of sentences, and the corpus has an index of the plurality of sentences for the documents. The parser is structured to receive an indication of a plurality of consecutive sentences and decompose the indicated plurality of consecutive sentences to generate a plurality of search terms for searching the document corpus. The postprocessor is structured to determine a plurality of result sentences in the corpus that correlate to the search terms, using latent semantic regression techniques to determine the similarity of the search terms to the sentences in the corpus of documents, and to return indications of the determined result sentences.
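As a rough illustration of the entity-tag representation described in the abstract, the following Python sketch stores each meaningful term together with its entity tag type. It is not the patented implementation: the names `EntityTag`, `TOY_LEXICON`, `tag_sentence`, and `build_representation` are hypothetical, and a whitespace tokenizer with a small lookup table stands in for the parser and parse structure the abstract refers to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityTag:
    """An entity tag: a type (not a part of speech or grammatical role) and a value."""
    type: str   # e.g. "organization", "location"
    value: str  # the term as it appears in the sentence


# Hypothetical toy lexicon; a real system would derive this from a parse structure.
TOY_LEXICON = {
    "insightful": "organization",
    "seattle": "location",
    "2004": "date",
}


def tag_sentence(sentence, designated_types, lexicon=TOY_LEXICON):
    """Return entity tags for terms whose tag type is among the designated types."""
    tags = []
    for token in sentence.split():
        term = token.strip(".,;:!?\"'()")           # drop surrounding punctuation
        tag_type = lexicon.get(term.lower())         # None if the term is unknown
        if tag_type in designated_types:
            tags.append(EntityTag(tag_type, term))
    return tags


def build_representation(sentences, designated_types):
    """Map each sentence index to its entity tags (a stand-in for the enhanced
    data representation data structure)."""
    return {i: tag_sentence(s, designated_types) for i, s in enumerate(sentences)}


if __name__ == "__main__":
    rep = build_representation(
        ["Insightful opened an office in Seattle in 2004."],
        {"organization", "location"},
    )
    print(rep)
```

In this sketch only the designated tag types are kept, so the term "2004" is skipped when "date" is not designated. That mirrors the abstract's step of receiving a designation of entity tags before storing terms.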
Publication number NZ542223(A) Publication date 2007.08.31
Application number NZ20040542223 Filing date 2004.02.12
Applicant INSIGHTFUL CORPORATION Inventors MARCHISIO, GIOVANNI B;KOPERSKI, KRZYSZTOF;LIANG, JISHENG;MURUA, ALEJANDRO;NGUYEN, THIEN;TUSK, CARSTEN;DHILLON, NAVDEEP S;POCHMAN, LUBOS
Classification G06F15/00;G06F17/27;G06F17/30;(IPC1-7):G06F17/30 Main classification G06F15/00
Agency Agent
Principal claim
Address