Title of invention: Supplementing structured information about entities with information from unstructured data sources
Abstract: A method for supplementing structured information within a data system for entities based on unstructured data analyzes a document with unstructured data and extracts attribute values from the unstructured data for one or more entities of the data system. Entity records with structured information are retrieved from the data system based on the extracted attribute values. Entity references for corresponding entities of the data system are constructed based on a comparison of the retrieved entity records and the extracted attribute values. The entity references are linked to the corresponding entities within the data system, with the entity references including extracted attributes from the unstructured data for corresponding linked entities.
Publication number: US9251182(B2) Publication date: 2016.02.02
Application number: US201313798229 Filing date: 2013.03.13
Applicant: International Business Machines Corporation Inventors: Deshpande Prasad M.;Mohania Mukesh K.;Murthy Karin;Padmanabhan Deepak S.;Reed Jennifer S.;Schumacher Scott
Classification: G06F17/30;G06F15/16 Main classification: G06F17/30
Agency: SVL IPLaw Edell, Shapiro & Finnan, LLC Agent: Carroll Terry;SVL IPLaw Edell, Shapiro & Finnan, LLC
Independent claim: 1. A computer-implemented method of supplementing structured information within a data system for entities based on unstructured data comprising: analyzing documents with unstructured data specifying two or more entities of the structured information and interactions between those two or more entities; identifying from the interactions within the unstructured data of the documents one or more relationships between entities of the structured information; extracting attribute values from the unstructured data for one or more entities of the structured information based on a comparison of the unstructured data with one or more dictionaries each including values for a corresponding attribute of an entity within the data system, wherein extracting attribute values from the unstructured data includes: generating tokens from the unstructured data and comparing the tokens to the values within the one or more dictionaries, wherein at least one value within a dictionary includes a plurality of tokens; retrieving entity records with structured information from the data system based on the extracted attribute values; constructing entity references for corresponding one or more entities of the data system based on a comparison of the retrieved entity records and the extracted attribute values; linking the entity references to the corresponding one or more entities within the data system to supplement the structured information for the corresponding one or more entities with information extracted from the unstructured data, wherein the entity references include extracted attributes from the unstructured data for corresponding linked entities; and linking entities of the structured information to each other within the structured information to indicate related entities based on the one or more relationships between those entities identified from the interactions specified within the unstructured data of the documents.
Address: Armonk NY US
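
The extraction and linking steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the dictionaries, the entity records, the function names (`tokenize`, `extract_attributes`, `build_references`) and the `min_score` threshold are all hypothetical, and the comparison of records against extracted values is simplified to counting exact attribute matches. Relationship identification between entities is omitted.

```python
import re
from collections import defaultdict

# Hypothetical per-attribute dictionaries; a value may span several tokens.
DICTIONARIES = {
    "name": {"john smith", "mary jones"},
    "city": {"new york", "armonk"},
}

# Hypothetical structured entity records, keyed by entity id.
RECORDS = {
    1: {"name": "john smith", "city": "armonk"},
    2: {"name": "mary jones", "city": "new york"},
    3: {"name": "john smith", "city": "new york"},
}


def tokenize(text):
    """Generate lowercase alphanumeric tokens from unstructured text."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_attributes(text, dictionaries):
    """Compare token windows against dictionary values (single- or multi-token)."""
    tokens = tokenize(text)
    found = defaultdict(set)
    for attr, values in dictionaries.items():
        split_values = {tuple(v.split()) for v in values}
        max_len = max(len(v) for v in split_values)
        for i in range(len(tokens)):
            for n in range(1, max_len + 1):
                window = tuple(tokens[i:i + n])
                if window in split_values:
                    found[attr].add(" ".join(window))
    return dict(found)


def build_references(text, records, dictionaries, min_score=2):
    """Build entity references for records matching enough extracted values."""
    extracted = extract_attributes(text, dictionaries)
    refs = []
    for entity_id, record in records.items():
        # Keep the record attributes whose value was extracted from the text.
        matched = {a: v for a, v in record.items() if v in extracted.get(a, set())}
        if len(matched) >= min_score:
            refs.append({"entity_id": entity_id, "matched": matched})
    return refs


doc = "John Smith met Mary Jones in Armonk last week."
references = build_references(doc, RECORDS, DICTIONARIES)
```

In this sketch the sample document yields a reference only for the record whose name and city both appear in the text; a real system would presumably use a more tolerant comparison than exact matching.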