Title PROBABILISTIC RECORD LINKING
Abstract Probabilistic record linking methods and a system are provided. Selections are acquired that identify two data sources, column identifiers from each of the two data sources, pairs of column identifiers drawn from the two data sources, and confidence values for matching the records associated with each pair. The selections are used to compare data housed in the two data sources. Based on the comparison, matched records and non-matched records are identified from the two data sources.
Publication number US2014280274(A1) Publication date 2014.09.18
Application number US201414177702 Filing date 2014.02.11
Applicant Teradata US, Inc. Inventors Louis Anand; Saket Shashank
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim 1. A method, comprising: mapping, by a processor, master column identifiers to target column identifiers, each mapping identifying a unique pair for a particular master column identifier to a particular target column identifier; acquiring, by the processor, a match confidence value and a no-match confidence value for each mapping pair; comparing, by the processor, each mapping pair with its corresponding master data in a master data source to target data in a target data source using each mapping pair's match confidence value and non-match confidence value; and generating, by the processor, matched records in a matched pool, non-matched records from the master data source in a non-matched pool, and potential matched records in a potential matched pool based on the comparison.
Address Dayton OH US
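
The steps recited in the main claim (column-pair mapping, per-pair match and no-match confidence values, comparison, and sorting into three pools) can be illustrated with a minimal Python sketch. It is not the patented implementation. The scoring rule is an assumption borrowed from the classic Fellegi-Sunter model, in which each pair's match confidence `m` and no-match confidence `u` are read as the probabilities that the pair agrees for true matches and for non-matches. The function name `link_records`, the thresholds `upper` and `lower`, and the sample column names and data are hypothetical.

```python
import math

def link_records(master, target, pairs, upper=3.0, lower=0.0):
    """Sort master records into matched, potential, and non-matched pools.

    pairs: list of (master_col, target_col, m, u) tuples, where m and u are
    the match and no-match confidence values for that column pair
    (assumed here to be agreement probabilities, Fellegi-Sunter style).
    upper/lower: hypothetical score thresholds separating the three pools.
    """
    matched, potential, non_matched = [], [], []
    for m_rec in master:
        best_score, best_rec = -math.inf, None
        for t_rec in target:
            score = 0.0
            for m_col, t_col, m, u in pairs:
                if m_rec[m_col] == t_rec[t_col]:
                    score += math.log2(m / u)              # agreement weight
                else:
                    score += math.log2((1 - m) / (1 - u))  # disagreement weight
            if score > best_score:
                best_score, best_rec = score, t_rec
        if best_score >= upper:
            matched.append((m_rec, best_rec))
        elif best_score >= lower:
            potential.append((m_rec, best_rec))
        else:
            non_matched.append(m_rec)
    return matched, potential, non_matched

# Hypothetical mapping: master column -> target column, with (m, u) per pair.
pairs = [("name", "full_name", 0.9, 0.1), ("zip", "postal", 0.8, 0.2)]
master = [
    {"name": "Ann Lee", "zip": "45402"},
    {"name": "Bob Ray", "zip": "45402"},
    {"name": "Cy Orr", "zip": "99999"},
]
target = [
    {"full_name": "Ann Lee", "postal": "45402"},
    {"full_name": "Bob Ray", "postal": "10001"},
]

matched, potential, non_matched = link_records(master, target, pairs)
```

With this sample data, a master record that agrees on both column pairs scores above `upper`, one that agrees on name only falls between the thresholds, and one that agrees on neither falls below `lower`. Comparing every master record against every target record is quadratic and is used here only for brevity.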