Title of Invention: SYSTEM AND METHOD FOR CREATING AND MAINTAINING A DATABASE OF DISAMBIGUATED ENTITY MENTIONS AND RELATIONS FROM A CORPUS OF ELECTRONIC DOCUMENTS
Abstract: Method and apparatus for creating an electronic database of disambiguated entity mentions and relations from a corpus of electronic documents. The invention automatically extracts mentions of entities (e.g., references to people, organizations or places) from the corpus, parses the entity mentions into "mention objects," and executes a series of grouping, comparison and hierarchical fuzzy object clustering algorithms. These algorithms cluster together in an electronic database all of the mention objects that refer to the same entity, as well as all of the mention objects (e.g., "people") that are associated with each other by a relationship (e.g., "co-authors" or "family members"). The resulting electronic database of disambiguated entity mentions and relations, which may comprise, for example, an XML document, a relational database or a hierarchical database, is structured to permit useful recordation, access, review and display of all of the mentions and relations associated with a particular entity or collection of entities.
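The clustering step described in the abstract can be illustrated with a minimal sketch. It does not reproduce the patented method: it substitutes a plain single-link threshold clustering over string similarity (Python's `difflib.SequenceMatcher`) for the "hierarchical fuzzy object clustering" the abstract names. The `Mention` class, the `similarity` and `cluster_mentions` functions, and the 0.8 threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Mention:
    """Hypothetical 'mention object': surface text plus its source document."""
    text: str
    doc_id: str


def similarity(a: Mention, b: Mention) -> float:
    """Case-insensitive string similarity in [0, 1] (an assumed stand-in metric)."""
    return SequenceMatcher(None, a.text.lower(), b.text.lower()).ratio()


def cluster_mentions(mentions, threshold=0.8):
    """Group mentions whose pairwise similarity meets the threshold.

    Uses union-find, so clusters are the connected components of the
    'similar enough' graph (single-link clustering at one cut level).
    """
    parent = list(range(len(mentions)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(mentions)):
        for j in range(i + 1, len(mentions)):
            if similarity(mentions[i], mentions[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i, mention in enumerate(mentions):
        clusters.setdefault(find(i), []).append(mention)
    return list(clusters.values())


if __name__ == "__main__":
    sample = [
        Mention("Michael Woytowitz", "doc-1"),
        Mention("michael woytowitz", "doc-2"),
        Mention("Comsort Inc", "doc-3"),
    ]
    for group in cluster_mentions(sample):
        print([m.text for m in group])
```

A real system of the kind the abstract describes would presumably compare more than surface strings (for example, context or co-occurring entities) and would also record relations between clusters; this sketch covers only the grouping of mentions that appear to name the same entity.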
Publication No.: CA2819066(C) Publication Date: 2014.03.25
Application No.: CA20112819066 Filing Date: 2011.08.10
Applicant: COMSORT, INC. Inventors: WOYTOWITZ, MICHAEL A.; HAWKS, MARSHALL WELLS
Classification: G06F17/30; G06F17/20 Main Classification: G06F17/30
Agency: Agent:
Main Claim:
Address: