Title of invention: Extracting and displaying compact and sorted results from queries over unstructured or semi-structured text
Abstract: A system for indexing unstructured or semi-structured data is disclosed. The system may identify regions within the data, such as “Abstract” or “References”. The system may identify linguistic units such as sentences, noun groups, and verb groups. The system may also identify concepts such as companies, people, diseases, amounts, and so forth. The query results may be formatted so that similar results from different documents, or from the same document, are clustered together.
Publication number: US9031926(B2) Publication date: 2015.05.12
Application number: US201213414372 Filing date: 2012.03.07
Applicant: Linguamatics Ltd. Inventors: Milward David R.; Thomas James R.; Knight Sylvia F.; Hale Roger W.
Classification: G06F17/30; G06F17/27 Main classification: G06F17/30
Agency: Perkins Coie LLP Agent: Perkins Coie LLP
Main claim: 1. A non-transitory computer-readable storage device comprising instructions that, when executed by a computer system, cause the computer system to: receive a query that includes at least one linguistic constraint and an indication of at least one region; cause a data structure to be searched, based on the received query; identify, within the data structure, at least one semi-structured document for which the at least one linguistic constraint is satisfied within the at least one region, wherein the data structure identifies linguistic units identified for the at least one semi-structured document, wherein the identified linguistic units include grammatical units within a sentence, wherein the grammatical units include a noun phrase, a verb group, or both a noun phrase and a verb group, wherein the noun phrase comprises at least one noun and any modifier of the at least one noun, and wherein the verb group comprises at least one verb; and format results of the query by clustering together similar results, whether from the at least one semi-structured document or from multiple semi-structured documents.
Address: Cambridge GB
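
Illustrative sketch (not part of the patent record): the flow recited in the main claim (a query with a linguistic constraint and a region, a search over an indexed data structure, and clustering of similar results) might look roughly like the Python below. Everything here is an assumption made for illustration: the `Unit` record, the `search` and `cluster` helpers, the sample data, and the choice to treat results as "similar" when their case- and whitespace-normalized text is identical. The record does not say how linguistic units are identified or how similarity is measured, so the units are taken as already annotated.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """Hypothetical index entry: one linguistic unit found in one region."""
    doc_id: str
    region: str  # e.g. "Abstract" or "References"
    kind: str    # "noun_phrase" or "verb_group"
    text: str


def search(index, region, kind, term):
    """Return units of the given kind, inside the given region, that contain term."""
    term = term.lower()
    return [
        u for u in index
        if u.region == region and u.kind == kind and term in u.text.lower().split()
    ]


def cluster(hits):
    """Group hits with identical normalized text, within or across documents."""
    groups = defaultdict(list)
    for u in hits:
        key = " ".join(u.text.lower().split())  # assumed similarity: same normalized text
        groups[key].append(u.doc_id)
    # Largest clusters first; ties broken alphabetically by cluster text.
    return sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))


# Made-up sample index for demonstration only.
INDEX = [
    Unit("doc1", "Abstract", "noun_phrase", "Breast cancer"),
    Unit("doc2", "Abstract", "noun_phrase", "breast  cancer"),
    Unit("doc2", "Abstract", "noun_phrase", "lung cancer"),
    Unit("doc3", "References", "noun_phrase", "breast cancer"),
    Unit("doc3", "Abstract", "verb_group", "may inhibit"),
]

results = cluster(search(INDEX, "Abstract", "noun_phrase", "cancer"))
for text, doc_ids in results:
    print(text, doc_ids)
```

In this sketch the hit in doc3 is excluded because it falls in the "References" region, and the two spellings of "breast cancer" from doc1 and doc2 end up in one cluster.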