发明名称 RESOLUTION OF DATA INCONSISTENCIES
摘要 Examples disclosed herein enable identifying a feature that is common to a first dataset and a second dataset, wherein a first value of the feature in the first dataset is different from a second value of the feature in the second dataset; determining a first predicted value of the feature in the first dataset based on a second dataset classifier trained on the second dataset; determining a second predicted value of the feature in the second dataset based on a first dataset classifier trained on the first dataset; determining a first similarity score between the first value and the first predicted value; determining a second similarity score between the second value and the second predicted value; and generating a bipartite graph that comprises a first node indicating the first value, a second node indicating the second value, and an edge indicating the first or second similarity score.
申请公布号 US2016147799(A1) 申请公布日期 2016.05.26
申请号 US201414554418 申请日期 2014.11.26
申请人 Hewlett-Packard Development Company, L.P. 发明人 Cohen Ira;Gelberg Mor;Egozi Levi Efrat
分类号 G06F17/30;G06N99/00 主分类号 G06F17/30
代理机构 代理人
主权项 1. A method for execution by a computing device for resolving data inconsistencies, the method comprising: identifying a feature that is common to a first dataset and a second dataset, wherein a first value of the feature in the first dataset is different from a second value of the feature in the second dataset; determining a first predicted value of the feature in the first dataset based on a second dataset classifier trained on the second dataset; determining a second predicted value of the feature in the second dataset based on a first dataset classifier trained on the first dataset; determining a first similarity score between the first value and the first predicted value; determining a second similarity score between the second value and the second predicted value; and generating a bipartite graph that comprises a first node indicating the first value, a second node indicating the second value, and an edge indicating the first or second similarity score.
地址 Houston TX US