Title of invention: METHOD FOR ENHANCING RECORD LINKAGE PRODUCTION DATA QUALITY
Abstract: A Record Linkage Production Data Quality (RLPDQ) tool provides an independent record linkage system that produces results for comparison with those of a production record linkage system, together with an efficient arbitration operation to resolve the respective confusion matrices of the two systems. The tool can be used for enhancing the output of record linkage engines, for merging different files into a third file containing expanded descriptions of common entities in both files, and for making testable improvements to a record linkage engine.
Publication number: US2015269486(A1)   Publication date: 2015.09.24
Application number: US201514659978   Filing date: 2015.03.17
Applicant: ADI, LLC   Inventor: Paxton K. Bradley
Classification: G06N5/02; G06F17/30   Main classification: G06N5/02
Agency:   Agent:
Principal claim: 1. A method of enhancing the performance of a production record linkage engine under control of a processor configured with executable instructions for establishing first comparative links between individual records that empirically describe a person or thing in different electronically encoded files, the first comparative links including predicted positive matches between some of the individual records in the different files and predicted negative matches between other of the individual records in the different files, comprising steps of: establishing second comparative links between the individual records of the different electronically encoded files with an independent record linkage engine under the control of a processor configured with different executable instructions from the executable instructions of the processor of the production record linkage engine, the second comparative links including predicted positive matches between some of the individual records in the different files and predicted negative matches between other of the individual records in the different files, using a processor configured with additional executable instructions to identify the predicted positive matches in common among the first and second comparative links as true positive matches, performing an arbitration to identify additional true positive matches from among at least one of: (a) the predicted positive matches of the first comparative links that correspond to predicted negative matches of the second comparative links and (b) the predicted positive matches of the second comparative links that correspond to predicted negative matches of the first comparative links, while substantially avoiding similar arbitration from among the predicted negative matches of the first comparative links that correspond to predicted negative matches of the second comparative links, and revamping the first comparative links established by the production record linkage system by at least one of: (a) excluding from revamped predicted positive matches predicted positive matches of the first comparative links that are not among the true positive matches and (b) including within the revamped predicted positive matches predicted negative matches of the first comparative links that are among the true positive matches, wherein the individual records in the different electronically encoded files include respective record addresses, and the revamped first comparative links include electronically encoded record links between the record addresses of the revamped predicted positive matches in the two different files.
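The flow recited in the claim can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the name `revamp_links`, the representation of a link as a pair of record addresses, and the `arbitrate` callback (a stand-in for whatever review step decides a disputed pair) are all hypothetical and do not come from the record. The sketch applies both revamping options (a) and (b), although the claim requires only at least one of them.

```python
from typing import Callable, Iterable, Set, Tuple

# Hypothetical representation: a link is (address in file A, address in file B).
Link = Tuple[str, str]


def revamp_links(
    production_pos: Iterable[Link],
    independent_pos: Iterable[Link],
    arbitrate: Callable[[Link], bool],
) -> Set[Link]:
    """Return revamped positive links for the production engine.

    Any pair absent from an engine's positive set is treated as a
    predicted negative match of that engine (an assumption of this sketch).
    """
    first = set(production_pos)
    second = set(independent_pos)

    # Positives predicted by both engines are taken as true positives.
    true_pos = first & second

    # Only pairs on which the engines disagree are arbitrated. Pairs that
    # both engines reject never appear here, so they are never reviewed.
    disputed = first ^ second
    true_pos |= {link for link in disputed if arbitrate(link)}

    # (a) drop production positives that are not true positives;
    # (b) add production negatives that are true positives.
    revamped = (first & true_pos) | (true_pos - first)
    return revamped


# Usage with made-up record addresses.
production = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}
independent = {("a1", "b1"), ("a3", "b3")}
accepted_on_review = {("a2", "b2"), ("a3", "b3")}
reviewed = []


def review(link: Link) -> bool:
    reviewed.append(link)  # record which pairs were actually arbitrated
    return link in accepted_on_review


result = revamp_links(production, independent, review)
```

In this made-up example the agreed pair `("a1", "b1")` is kept without review, the three disputed pairs are arbitrated, and the production-only pair `("a3", "b9")` is excluded because arbitration rejects it.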
Address: Rochester NY US