Title of invention: Detecting relationships in unstructured text
Abstract: Disclosed are embodiments of a system and a method for detecting relationships described in unstructured text-based electronic documents. The system and method use an input file that contains one or more text patterns, each representing a particular relationship. Each text pattern includes regular text expressions that describe the relationship and slots that mark the location of each entity in that relationship. One or more documents are selected by a user and scanned by a proper noun tagger that identifies and tags every occurrence of a proper name within them. A pattern matcher then scans the documents to match the text patterns. If a text pattern is matched within a document, a relationship detector extracts all pairs of proper names found in the slots of each matched text pattern. The output from the relationship detector includes the names of each entity in the relationship, the type of relationship, the identity of the document, and the location within the document of the sentence describing the relationship.
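Publication number: US8001144(B2) Publication date: 2011.08.16
Application number: US20080056048 Filing date: 2008.03.26
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventor: NOVAK JASMINE
Classification: G06F17/30;G06F17/27 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address: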
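
The pipeline described in the abstract (text patterns with entity slots, a proper noun tagger, a pattern matcher, and a relationship detector) can be illustrated with a minimal sketch. This is not the patented implementation. The `<E1>`/`<E2>` slot syntax, the `PATTERNS` table standing in for the input file, the capitalization-based `PROPER_NAME` expression standing in for the proper noun tagger, and all function and class names are assumptions made for illustration only.

```python
import re
from dataclasses import dataclass

# Assumption: a crude stand-in for the proper noun tagger. It treats any run
# of capitalized words as one proper name; a real tagger would do far better.
PROPER_NAME = r"[A-Z][A-Za-z]*(?: [A-Z][A-Za-z]*)*"

# Hypothetical contents of the pattern input file:
# relationship type -> regular expression with two entity slots.
PATTERNS = {
    "acquisition": "<E1> acquired <E2>",
    "employment": "<E1> works for <E2>",
}


@dataclass(frozen=True)
class Relationship:
    entity1: str
    entity2: str
    rel_type: str
    doc_id: str
    sentence_index: int  # position of the matching sentence in the document


def compile_pattern(text_pattern: str) -> re.Pattern:
    """Replace each slot with a named group that accepts one proper name."""
    regex = text_pattern.replace("<E1>", f"(?P<e1>{PROPER_NAME})")
    regex = regex.replace("<E2>", f"(?P<e2>{PROPER_NAME})")
    return re.compile(regex)


def detect_relationships(doc_id: str, text: str) -> list[Relationship]:
    """Match every pattern against every sentence and collect the slot pairs."""
    compiled = {name: compile_pattern(p) for name, p in PATTERNS.items()}
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    found = []
    for index, sentence in enumerate(sentences):
        for rel_type, pattern in compiled.items():
            for match in pattern.finditer(sentence):
                found.append(
                    Relationship(match["e1"], match["e2"], rel_type, doc_id, index)
                )
    return found


if __name__ == "__main__":
    sample = "Acme Corp acquired Widget Works in 2008. Jane Doe works for Acme Corp."
    for rel in detect_relationships("doc1", sample):
        print(rel)
```

Each result carries the four output items named in the abstract: the two entity names, the relationship type, the document identity, and the sentence location. Folding the tagger into the slot expression keeps the sketch short; a system that tags proper names in a separate pass, as the abstract describes, would instead match the slots against the tags.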