Title of invention: METHODS AND SYSTEMS FOR MATCHING RECORDS AND NORMALIZING NAMES
Abstract: Methods and systems are provided for normalizing strings and for matching records. In one implementation, a string is tokenized into components. Sequences of tags are generated by assigning tags to the components. A sequence of states is determined based on the sequences of tags. A normalized string is generated by normalizing the sequence of states. A key record including key fields is extracted from a first data source. A candidate record including candidate fields is extracted from a second data source. A numerical record including numerical fields is computed by comparing the key fields and the candidate fields using comparison functions. Matching functions determined by an additive logistic regression method are applied to the numerical fields. Whether the key record and the candidate record are a match is determined based on a sum of results of the matching functions.
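The record-matching half of the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the field names (`name`, `zip`), the two comparison functions, and the hand-picked step functions that stand in for matching functions an additive logistic regression method would learn from training data. The sketch only shows the shape of the pipeline: compare fields, obtain a numerical record, apply one matching function per numerical field, and decide on the sum.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Comparison function: case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def exact_match(a: str, b: str) -> float:
    """Comparison function: 1.0 when the two values are identical, else 0.0."""
    return 1.0 if a == b else 0.0


# Hypothetical choice of one comparison function per field.
COMPARATORS = {"name": name_similarity, "zip": exact_match}


def to_numerical_record(key: dict, candidate: dict) -> dict:
    """Compare key fields with candidate fields to get numerical fields."""
    return {field: cmp(key[field], candidate[field])
            for field, cmp in COMPARATORS.items()}


def step(threshold: float, low: float, high: float):
    """Build a one-variable step function (a decision stump)."""
    return lambda x: high if x >= threshold else low


# Assumed matching functions. In the described method these would be
# determined by additive logistic regression; the thresholds and weights
# below are invented placeholders, not learned values.
MATCHING_FUNCTIONS = {
    "name": step(0.8, low=-1.0, high=1.5),
    "zip": step(0.5, low=-0.5, high=1.0),
}


def is_match(key: dict, candidate: dict) -> tuple[bool, float]:
    """Sum the matching-function results and declare a match if positive."""
    numerical = to_numerical_record(key, candidate)
    score = sum(MATCHING_FUNCTIONS[f](v) for f, v in numerical.items())
    return score > 0.0, score


if __name__ == "__main__":
    key = {"name": "John Smith", "zip": "45342"}
    print(is_match(key, {"name": "JOHN SMITH", "zip": "45342"}))
    print(is_match(key, {"name": "Mary Jones", "zip": "10001"}))
```

Deciding on the sign of the summed score mirrors how an additive model is typically turned into a binary decision; a real system might use a different threshold or map the sum to a probability first.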
Publication number: US2010198756(A1)  Publication date: 2010.08.05
Application number: US20090363057  Filing date: 2009.01.30
Applicant: ZHANG LING QIN;WASSON MARK;TEMPLAR VALENTINA  Inventor: ZHANG LING QIN;WASSON MARK;TEMPLAR VALENTINA
Classification: G06F15/18;G06F17/30;G06N5/02  Main classification: G06F15/18
Agency:  Agent:
Independent claim:
Address: