Title: Generating candidate entities using over frequent keys
Abstract: Systems, methods, and computer program products are disclosed for matching an inbound identity record to existing entities. A composite generic key may be generated from multiple entity resolution (ER) candidate-building keys determined to be generic keys. A query may be generated based on the composite generic key and executed to retrieve candidate entities for an inbound identity record.
Publication number: US8918394(B2); Publication date: 2014.12.23
Application number: US201213458805; Filing date: 2012.04.27
Applicant: International Business Machines Corporation; Inventors: Allen Thomas B.; Macy Brian E.
Classification: G06F17/30; Main classification: G06F17/30
Agency: Patterson & Sheridan, LLP; Agent: Patterson & Sheridan, LLP
Main claim: 1. A computer-implemented method, comprising: generating a plurality of distinct entity resolution (ER) candidate-building keys from a single identity record and without requiring user intervention, wherein each ER candidate-building key is generated based on a distinct field value contained in the single identity record; determining that at least two of the plurality of distinct ER candidate-building keys are each unsuitable for generating a restricted set of candidate entities against which to match the single identity record, by executing a respective query generated from the respective ER candidate-building key and that yields query results exceeding a predefined record count; upon determining that the at least two ER candidate-building keys are each unsuitable, generating a composite ER candidate-building key based on the at least two ER candidate-building keys and by operation of one or more computer processors, without requiring user intervention; executing a query generated based on the composite ER candidate-building key, in order to obtain the restricted set of candidate entities against which to match the single identity record, wherein the restricted set of candidate entities is selected from a plurality of available entities greater in number than the restricted set of candidate entities, wherein each entity in the restricted set of candidate entities represents a distinct individual and matches the composite ER candidate-building key; resolving the identity record by scoring the identity record against each candidate entity in the restricted set of candidate entities and to the exclusion of at least one available entity not in the restricted set of candidate entities; and maintaining a set of ER candidate-building keys determined as being unsuitable, including the at least two ER candidate-building keys, wherein the maintained set is subsequently used in order to facilitate determining an ER candidate-building key of another single identity record as being unsuitable.
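The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the in-memory dictionary standing in for the entity store, the "field=VALUE" key format, the shared-key score, and every function name (build_keys, lookup, candidate_entities, resolve) are assumptions made only for illustration. A key counts as unsuitable (generic) when its query returns more than max_count entities; two or more such keys are combined into one composite query.

```python
def build_keys(record):
    """One candidate-building key per populated field (assumed 'field=VALUE' format)."""
    return [
        f"{field}={str(value).strip().upper()}"
        for field, value in sorted(record.items())
        if value
    ]


def lookup(entities, keys):
    """Stand-in for a database query: ids of entities that carry every key in `keys`."""
    return [eid for eid, ekeys in entities.items() if all(k in ekeys for k in keys)]


def candidate_entities(record, entities, generic_keys, max_count):
    """Build a restricted candidate set; `generic_keys` is the maintained set and is updated."""
    candidates = set()
    generic_for_record = []
    for key in build_keys(record):
        if key in generic_keys:
            # Already known to be generic from an earlier record: skip the query.
            generic_for_record.append(key)
            continue
        hits = lookup(entities, [key])
        if len(hits) > max_count:
            # Too many results: remember the key as generic for later records.
            generic_keys.add(key)
            generic_for_record.append(key)
        else:
            candidates.update(hits)
    if len(generic_for_record) >= 2:
        # Composite key: an entity must match all generic keys at once.
        candidates.update(lookup(entities, generic_for_record))
    return candidates


def resolve(record, entities, generic_keys, max_count):
    """Score the record against each candidate only (toy score: number of shared keys)."""
    record_keys = set(build_keys(record))
    cands = candidate_entities(record, entities, generic_keys, max_count)
    scored = [(eid, len(record_keys & entities[eid])) for eid in cands]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Hypothetical data: entity id -> set of keys already stored for that entity.
    entities = {
        1: {"city=ARMONK", "surname=SMITH", "phone=555-0001"},
        2: {"city=ARMONK", "surname=JONES", "phone=555-0002"},
        3: {"city=ARMONK", "surname=SMITH", "phone=555-0003"},
        4: {"city=BOSTON", "surname=SMITH", "phone=555-0004"},
        5: {"city=ARMONK", "surname=LEE", "phone=555-0005"},
    }
    generic = set()
    inbound = {"city": "Armonk", "surname": "Smith", "phone": "555-0009"}
    print(sorted(candidate_entities(inbound, entities, generic, max_count=2)))  # [1, 3]
    print(sorted(generic))
```

In this toy data, the city key and the surname key each match more than two entities, so neither is used alone; the composite of the two narrows the candidates to entities 1 and 3, and entities 2, 4, and 5 are never scored.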
Address: Armonk, NY, US