Title of Invention: IDENTIFYING ENTITY MAPPINGS ACROSS DATA ASSETS
Abstract: Entity mappings that produce matching entities for a first data asset having attributes with attribute values and a second data asset having attributes with attribute values are generated by: matching the attribute values of the attributes of the first data asset with the attribute values of the attributes of the second data asset, using the matching attribute values to generate matching attribute pairs, and using the matching attribute pairs to identify entity mappings. An entity mapping score is computed for each of the entity mappings based on a combination of factors; the entity mappings are ranked based on each entity mapping score; and some of the ranked entity mappings are used to determine whether a same real-world entity is described by the first data asset and the second data asset.
Publication No.: US2017075898(A1) Publication Date: 2017.03.16
Application No.: US201615268400 Filing Date: 2016.09.16
Applicant: International Business Machines Corporation Inventors: Deshpande Prasad M.; Dey Atreyee; Gupta Rajeev; Gupta Sanjeev K.; Joshi Salil; Padmanabhan Sriram K.
Classification No.: G06F17/30 Main Classification No.: G06F17/30
Agency: Agent:
Main Claim: 1. A method, comprising: generating entity mappings that produce matching entities for a first data asset having attributes with attribute values and a second data asset having attributes with attribute values by: matching the attribute values of the attributes of the first data asset with the attribute values of the attributes of the second data asset; using the matching attribute values to generate matching attribute pairs; and using the matching attribute pairs to identify entity mappings; computing an entity mapping score for each of the entity mappings based on a combination of factors; ranking the entity mappings based on each entity mapping score; and using the ranked entity mappings to determine which of the entity mappings are to be used to determine whether a same real-world entity is described by the first data asset and the second data asset.
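Illustrative sketch (not part of the patent record): the claimed steps can be approximated in a few lines of Python. Everything specific here is an assumption made for illustration: the data assets are non-empty lists of row dictionaries, attribute values match only on exact equality, an entity mapping is treated as a set of one or two attribute pairs used together as a key, and the "combination of factors" is stood in for by coverage multiplied by uniqueness. The function names, the `min_overlap` threshold, and the sample data are hypothetical and do not come from the claim.

```python
from collections import Counter
from itertools import combinations


def matching_attribute_pairs(asset_a, asset_b, min_overlap=0.5):
    """Pair attributes whose value sets overlap enough (assumed criterion)."""
    pairs = []
    for attr_a in asset_a[0]:
        values_a = {row[attr_a] for row in asset_a}
        for attr_b in asset_b[0]:
            values_b = {row[attr_b] for row in asset_b}
            shared = values_a & values_b
            # Overlap is measured against the smaller of the two value sets.
            overlap = len(shared) / min(len(values_a), len(values_b))
            if overlap >= min_overlap:
                pairs.append((attr_a, attr_b))
    return pairs


def candidate_mappings(pairs, max_size=2):
    """Treat each small set of attribute pairs as one candidate entity mapping."""
    mappings = []
    for size in range(1, max_size + 1):
        for combo in combinations(pairs, size):
            # Skip combinations that reuse an attribute on either side.
            distinct_a = len({a for a, _ in combo}) == size
            distinct_b = len({b for _, b in combo}) == size
            if distinct_a and distinct_b:
                mappings.append(combo)
    return mappings


def score_mapping(mapping, asset_a, asset_b):
    """Hypothetical score: coverage of asset_a times uniqueness of the matches."""
    keys_a = [tuple(row[a] for a, _ in mapping) for row in asset_a]
    keys_b = Counter(tuple(row[b] for _, b in mapping) for row in asset_b)
    matched_rows = sum(1 for key in keys_a if key in keys_b)
    total_matches = sum(keys_b[key] for key in keys_a)
    if total_matches == 0:
        return 0.0
    coverage = matched_rows / len(keys_a)      # share of asset_a rows matched
    uniqueness = matched_rows / total_matches  # 1.0 when every match is one-to-one
    return coverage * uniqueness


def rank_mappings(asset_a, asset_b):
    """Score every candidate mapping and sort from best to worst."""
    pairs = matching_attribute_pairs(asset_a, asset_b)
    scored = [(score_mapping(m, asset_a, asset_b), m)
              for m in candidate_mappings(pairs)]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Made-up sample data for two data assets describing overlapping people.
customers = [
    {"name": "Ann Lee", "city": "Pune", "phone": "111"},
    {"name": "Bob Roy", "city": "Pune", "phone": "222"},
    {"name": "Cy Das", "city": "Delhi", "phone": "333"},
]
accounts = [
    {"holder": "Ann Lee", "town": "Pune", "contact": "111"},
    {"holder": "Bob Roy", "town": "Pune", "contact": "222"},
    {"holder": "Dee Sen", "town": "Delhi", "contact": "444"},
]

ranked = rank_mappings(customers, accounts)
for score, mapping in ranked:
    print(f"{score:.3f}  {mapping}")
```

In this sample, the city/town mapping covers every row but matches many-to-many, so the uniqueness factor pushes it to the bottom of the ranking. A downstream step could then keep only the top-ranked mappings when deciding whether two records describe the same real-world entity.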
Address: Armonk, NY, US