Title of invention Identifying a set of candidate entities for an identity record
Abstract Systems, methods, and computer program products are disclosed for matching an inbound identity record to existing entities. A composite generic key may be generated from multiple entity resolution (ER) candidate-building keys determined to be generic keys. A query may be generated based on the composite generic key and executed to retrieve candidate entities for an inbound identity record.
Publication number US8918393(B2) Publication date 2014.12.23
Application number US201012893982 Filing date 2010.09.29
Applicant International Business Machines Corporation Inventors Allen Thomas B.; Macy Brian E.
Classification G06F17/30 Main classification G06F17/30
Agency Patterson & Sheridan, LLP Agent Patterson & Sheridan, LLP
Main claim 1. A computer program product comprising a non-transitory computer-readable medium having computer-readable program code embodied therewith, the computer-readable program code executable by one or more computer processors to:
generate a plurality of distinct entity resolution (ER) candidate-building keys from a single identity record comprising a plurality of fields, wherein each ER candidate-building key is generated based on a distinct field value contained in the single identity record;
determine that at least two of the plurality of ER candidate-building keys are each unsuitable for generating a restricted set of candidate entities against which to match the single identity record, by executing a respective query generated from the respective ER candidate-building key and that yields query results exceeding a distinct threshold count specific to the respective ER candidate-building key;
upon determining that the at least two ER candidate-building keys are each unsuitable, generate a composite ER candidate-building key based on the at least two ER candidate-building keys;
execute a query generated based on the composite ER candidate-building key, in order to obtain the restricted set of candidate entities against which to match the single identity record, wherein the restricted set of candidate entities is selected from a plurality of available entities greater in number than the restricted set of candidate entities, wherein each entity in the restricted set of candidate entities represents a distinct individual and matches the composite ER candidate-building key;
resolve the identity record by scoring the identity record against each candidate entity in the restricted set of candidate entities and to the exclusion of at least one available entity not in the restricted set of candidate entities;
upon successfully resolving the single identity record to a first candidate entity in the restricted set of candidate entities, update the first candidate entity to include the single identity record;
upon unsuccessfully resolving the single identity record to any candidate entity in the restricted set of candidate entities, generate a new entity to include the single identity record; and
maintain a set of ER candidate-building keys determined as being unsuitable, including the at least two ER candidate-building keys, wherein the maintained set is subsequently used in order to facilitate determining an ER candidate-building key of another single identity record as being unsuitable.
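The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything named here is hypothetical: the class `CandidateBuilder`, the in-memory dictionary index standing in for a database query, the per-field `thresholds`, and the scoring rule (count of equal field values, with `min_score` as the cut-off for a successful resolution). Also assumed, beyond what the claim states, is that keys found suitable on their own contribute their query results to the candidate set.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Entity:
    entity_id: int
    records: list = field(default_factory=list)


class CandidateBuilder:
    """Hypothetical illustration of candidate building with composite generic keys."""

    def __init__(self, thresholds, default_threshold=1, min_score=3):
        self.thresholds = thresholds          # per-field threshold counts (assumed)
        self.default_threshold = default_threshold
        self.min_score = min_score            # assumed cut-off for a match
        self.entities = {}                    # entity_id -> Entity
        self.index = {}                       # (field, value) -> set of entity ids
        self.generic_keys = set()             # maintained set of unsuitable keys
        self._ids = count(1)

    def _query(self, key):
        # Stand-in for executing a query generated from one key.
        return self.index.get(key, set())

    def _is_generic(self, key):
        # A key already known to be generic is reused without re-querying.
        if key in self.generic_keys:
            return True
        limit = self.thresholds.get(key[0], self.default_threshold)
        if len(self._query(key)) > limit:
            self.generic_keys.add(key)
            return True
        return False

    def candidates(self, record):
        keys = list(record.items())           # one key per distinct field value
        generic = [k for k in keys if self._is_generic(k)]
        found = set()
        for k in keys:
            if k not in generic:
                found |= self._query(k)
        if len(generic) >= 2:
            # Composite key: an entity must match every generic key at once.
            found |= set.intersection(*(self._query(k) for k in generic))
        return found

    def resolve(self, record):
        best_id, best_score = None, 0
        for eid in sorted(self.candidates(record)):
            score = max(
                sum(1 for f, v in record.items() if other.get(f) == v)
                for other in self.entities[eid].records
            )
            if score > best_score:
                best_id, best_score = eid, score
        if best_id is None or best_score < self.min_score:
            # No candidate resolved: create a new entity for the record.
            best_id = next(self._ids)
            self.entities[best_id] = Entity(best_id)
        self.entities[best_id].records.append(record)
        for key in record.items():
            self.index.setdefault(key, set()).add(best_id)
        return best_id
```

In this sketch a generic key on its own never contributes candidates; only the intersection of two or more generic keys does, which is what keeps the candidate set smaller than the full set of available entities while still letting common values such as a frequent surname and a large city take part in matching.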
Address Armonk, NY, US