Title of invention: METHODS AND SYSTEMS FOR MATCHING RECORDS AND NORMALIZING NAMES
Abstract: Methods and systems are provided for normalizing strings and for matching records. In one implementation, a string is tokenized into components. Sequences of tags are generated by assigning tags to the components. A sequence of states is determined based on the sequences of tags. A normalized string is generated by normalizing the sequence of the states. A key record including key fields is extracted from a first data source. A candidate record including candidate fields is extracted from a second data source. A numerical record including numerical fields is computed by comparing the key fields and the candidate fields using comparison functions. Matching functions determined by an additive logistic regression method are applied to the numerical fields. Whether the key record and the candidate record are a match is determined based on a sum of results of the matching functions.
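The record-matching steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the field names (`name`, `city`, `zip`), the comparison functions, the threshold-style matching functions, and every numeric value are assumptions chosen for illustration. In an additive logistic regression setting the matching functions would be learned from labeled record pairs; here they are hand-picked.

```python
from difflib import SequenceMatcher

# Hypothetical comparison functions: each turns a (key field, candidate field)
# pair into one number.
def exact(a, b):
    """1.0 if the two fields are equal ignoring case and outer whitespace, else 0.0."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0

def similarity(a, b):
    """Character-level similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Assumed field layout; the abstract does not name any fields.
COMPARATORS = {"name": similarity, "city": exact, "zip": exact}

def numerical_record(key, candidate):
    """Compare key fields with candidate fields to get one number per field."""
    return {f: compare(key[f], candidate[f]) for f, compare in COMPARATORS.items()}

def stump(threshold, low, high):
    """A one-split matching function: `high` at or above the threshold, else `low`."""
    return lambda x: high if x >= threshold else low

# Hand-picked stand-ins for matching functions that a training procedure
# would normally produce. All values are assumptions.
MATCHERS = {
    "name": stump(0.8, -1.5, 2.0),
    "city": stump(0.5, -0.5, 0.5),
    "zip": stump(0.5, -1.0, 1.0),
}

def match_score(numeric):
    """Sum of the matching-function results over all numerical fields."""
    return sum(fn(numeric[f]) for f, fn in MATCHERS.items())

def is_match(key, candidate, threshold=0.0):
    """Declare a match when the summed score exceeds the (assumed) threshold."""
    return match_score(numerical_record(key, candidate)) > threshold

key = {"name": "Jonathan Smith", "city": "Dayton", "zip": "45402"}
candidate = {"name": "Jonathon Smith", "city": "dayton", "zip": "45402"}
other = {"name": "Maria Lopez", "city": "Austin", "zip": "73301"}

print(is_match(key, candidate))
print(is_match(key, other))
```

Keeping the comparison step separate from the scoring step mirrors the abstract: the numerical record is computed first, and the match decision depends only on the sum of the per-field matching-function results.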
Application publication number: CA2750609(A1) Publication date: 2010.08.05
Application number: CA20102750609 Filing date: 2010.01.14
Applicant: LEXISNEXIS GROUP Inventors: ZHANG, LING QIN; WASSON, MARK; TEMPLAR, VALENTINA
Classification: G06F17/27 Main classification: G06F17/27
Agency: Agent:
Principal claim:
Address: