Title Determining similarity of unfielded names using feature assignments
Abstract Provided are techniques for comparing names. A first phrase score is obtained by comparing a name phrase in a first name to a name phrase in a second name. A second phrase score is obtained by comparing another name phrase in the first name to another name phrase in the second name. An overall score is generated based on the obtained first phrase score and the obtained second phrase score. The overall score is updated based on comparing features of the first name with features of the second name.
Publication number US9229926(B2) Publication date 2016.01.05
Application number US201213692798 Filing date 2012.12.03
Applicant International Business Machines Corporation Inventor Patman Maguire Frankie E.
Classification G06F17/27 Main classification G06F17/27
Agency Konrad, Raynes, Davda & Victor LLP Agent Davda Janaki K.; Konrad, Raynes, Davda & Victor LLP
Main claim 1. A computer program product for comparing names, the computer program product comprising: a computer readable storage medium having computer readable program code embodied therewith, wherein the non-transitory computer readable program code, when executed by at least one processor of a computer, is configured to perform: receiving, into the computer, a first name for comparison; for each pair of names including the first name and a second name selected from a set of names: identifying features of the first name that comprise semantic relationships that include information about cultural derivation, hierarchical relationships that describe dependency relationships between multiple words that form a first name phrase within the first name, position information for each of the words in the first name phrase, and position information for name phrases in the first name; identifying features of the second name that comprise semantic relationships that include information about cultural derivation, hierarchical relationships that describe dependency relationships between multiple words that form a second name phrase within the second name, position information for each of the words in the second name phrase, and position information for name phrases in the second name; obtaining a first phrase score by comparing the first name phrase in the first name to the second name phrase in the second name; obtaining a second phrase score by comparing another name phrase in the first name to another name phrase in the second name; generating an overall score for the pair of names based on the obtained first phrase score and the obtained second phrase score; and updating the overall score for the pair of names based on comparing the features of the first name with the features of the second name; and providing, with the computer, a list of the second names from the set of second names that matched the first name in order of the updated score for each pair of names.
Address Armonk NY US
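
The scoring flow recited in the main claim (phrase scores, an overall score, a feature-based update, and a ranked list) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `Name` class, the use of `difflib.SequenceMatcher` as the phrase comparison, averaging as the way phrase scores are combined, and the 0.9 and 0.95 adjustment factors. Only two of the claimed features (cultural derivation and phrase position) are modeled, and every name is assumed to have at least one phrase.

```python
from __future__ import annotations

from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Name:
    """Hypothetical container: ordered name phrases plus one semantic feature."""
    phrases: list[str]          # e.g. ["Maria", "de la Cruz"]; list index = phrase position
    culture: str = "unknown"    # stand-in for the cultural-derivation feature


def phrase_score(a: str, b: str) -> float:
    """Similarity of two name phrases in [0, 1] (assumed metric, not the patent's)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_matches(first: Name, second: Name) -> list[tuple[float, int, int]]:
    """For each phrase of `first`, find its best-scoring phrase in `second`.

    Returns (score, position_in_first, position_in_second) triples.
    """
    matches = []
    for i, p in enumerate(first.phrases):
        j, s = max(
            ((j, phrase_score(p, q)) for j, q in enumerate(second.phrases)),
            key=lambda t: t[1],
        )
        matches.append((s, i, j))
    return matches


def compare(first: Name, second: Name) -> float:
    """Overall score from phrase scores, then updated by comparing features."""
    matches = best_matches(first, second)
    score = sum(s for s, _, _ in matches) / len(matches)  # overall score (assumed: mean)
    if first.culture != second.culture:                   # semantic feature differs
        score *= 0.9
    if any(i != j for _, i, j in matches):                # phrase positions differ
        score *= 0.95
    return score


def rank(query: Name, candidates: list[Name]) -> list[tuple[float, Name]]:
    """Candidates paired with their updated score, highest score first."""
    scored = [(compare(query, c), c) for c in candidates]
    return sorted(scored, key=lambda t: t[0], reverse=True)


query = Name(["Maria", "de la Cruz"], "hispanic")
exact = Name(["Maria", "de la Cruz"], "hispanic")
swapped = Name(["de la Cruz", "Maria"], "hispanic")
unrelated = Name(["John", "Smith"], "anglo")
ranking = rank(query, [unrelated, swapped, exact])
```

In this sketch the swapped-order candidate still matches every phrase but ranks below the exact candidate, because the position feature lowers its updated score. A real system would likely use richer features and a tuned combination rule.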