Title: CREATING AN ONTOLOGY ACROSS MULTIPLE SEMANTICALLY-RELATED DATA SETS
Abstract: Embodiments presented herein disclose techniques for generating an entity pool, a hierarchical structure of related nodes that assists with classification and comparison of dissimilar data sets. To generate the entity pool, text references and metadata are collected from a public source, such as an online encyclopedia or other text source that provides dense, structured data focused on identified terminology. The text references are assigned similarity scores based on contextual information provided by the metadata. The text references are clustered into nodes based on similarity. Relationships between the nodes are defined based on edges generated between the nodes.
Publication No.: US2015178372(A1)  Publication Date: 2015.06.25
Application No.: US201314134741  Filing Date: 2013.12.19
Applicant: OpenGov, Inc.  Inventor: SEAL Matthew
Classification: G06F17/30  Main Classification: G06F17/30
Agency:  Agent:
Main Claim: 1. A computer-implemented method for generating an entity pool that maps elements from multiple hierarchies to a plurality of nodes, the method comprising: identifying, by operation of one or more computer processors, a first plurality of mentions and metadata, wherein each mention comprises a text string and wherein the metadata comprises hierarchical information about a corresponding mention; grouping mentions based on a first measure of similarity; generating, for each group of mentions, a node in an entity pool; and identifying relationships between one or more pairs of nodes in the entity pool based on the mentions stored by each node of a given pair of the nodes.
Address: Mountain View CA US
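
Illustrative sketch: the steps recited in the main claim (collect mentions with hierarchical metadata, group them by a similarity measure, create one node per group, then relate pairs of nodes) could be sketched roughly as follows. This is not the method disclosed in the application. Everything concrete here is an assumption made for illustration: the `Mention` type, the Jaccard-based `similarity` function and its 0.7 weight, the 0.5 threshold, the greedy grouping strategy, and the rule that links two nodes when a mention text in one appears in the hierarchy path of a mention in the other. The sample mentions are invented.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Mention:
    """Hypothetical record: a text string plus hierarchical metadata."""
    text: str
    path: tuple  # ancestor labels from the source hierarchy (assumed shape)


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(m1, m2, text_weight=0.7):
    """Assumed measure: weighted mix of token overlap and hierarchy overlap."""
    text_sim = jaccard(set(m1.text.lower().split()), set(m2.text.lower().split()))
    ctx_sim = jaccard(set(m1.path), set(m2.path))
    return text_weight * text_sim + (1 - text_weight) * ctx_sim


def group_mentions(mentions, threshold=0.5):
    """Greedy grouping: join the first group holding a similar enough member."""
    groups = []
    for m in mentions:
        for g in groups:
            if any(similarity(m, other) >= threshold for other in g):
                g.append(m)
                break
        else:
            groups.append([m])
    return groups


def build_entity_pool(mentions, threshold=0.5):
    """Return (nodes, edges): one node per group, directed parent->child edges."""
    nodes = dict(enumerate(group_mentions(mentions, threshold)))
    edges = set()
    for a, b in combinations(nodes, 2):
        texts = {n: {m.text.lower() for m in nodes[n]} for n in (a, b)}
        paths = {n: {p.lower() for m in nodes[n] for p in m.path} for n in (a, b)}
        # Assumed rule: a node is a parent of another when one of its mention
        # texts appears in the hierarchy path of a mention in the other node.
        if texts[a] & paths[b]:
            edges.add((a, b))
        if texts[b] & paths[a]:
            edges.add((b, a))
    return nodes, edges


# Invented sample data, for illustration only.
mentions = [
    Mention("Police Department", ("Public Safety",)),
    Mention("Police Dept", ("Public Safety",)),
    Mention("Public Safety", ()),
    Mention("Fire Rescue", ("Public Safety",)),
]
nodes, edges = build_entity_pool(mentions)
```

With this sample input, the two police mentions fall into one node because shared hierarchy context lifts their score above the threshold, while "Public Safety" and "Fire Rescue" each get their own node. A production system would likely replace the greedy pass with a proper clustering algorithm and a similarity measure tuned to the data.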