Title of invention: NATIVE-SCRIPT AND CROSS-SCRIPT CHINESE NAME MATCHING
Abstract: Techniques for Chinese name matching are described. A Chinese name is received and is romanized into a Mandarin Pinyin representation. The Mandarin Pinyin representation of the Chinese name is matched against a set of Romanized Chinese names originating from several different Chinese character names. In response to finding a potential match between the Mandarin Pinyin representation and a Romanized Chinese name, the original Chinese script for the Romanized Chinese name is retrieved. A native script comparison is applied between the received Chinese name and the original Chinese script for the Romanized Chinese name to obtain a match score. The native script comparison includes character-by-character comparison, character variant look-up, and/or consideration of name component misalignments. The obtained match score is used as a filter to reduce false positives that are generated in the matching of the Mandarin Pinyin representation against the set of Romanized Chinese names.
Application publication number: US2015046154(A1) Publication date: 2015.02.12
Application number: US201414491210 Filing date: 2014.09.19
Applicant: International Business Machines Corporation Inventors: Huang Shudong; King Nien C.
Classification: G06F17/28 Main classification: G06F17/28
Patent agency: Patent agent:
Main claim: 1. A computer-implemented method for Chinese name matching, comprising: receiving, by a processor, a Chinese name; romanizing, by the processor, the received Chinese name into a Mandarin Pinyin representation; matching, by the processor, the Mandarin Pinyin representation of the Chinese name against a set of Romanized Chinese names, wherein the Romanized Chinese names originate from a plurality of different Chinese character names; in response to finding a potential match between the Mandarin Pinyin representation and a Romanized Chinese name, retrieving, by the processor, the original Chinese script for the Romanized Chinese name; and applying, by the processor, a native script comparison between the received Chinese name and the original Chinese script for the Romanized Chinese name as a filter to reduce false positives that are generated in the matching of the Mandarin Pinyin representation of the Chinese name against the set of Romanized Chinese names.
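The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the tables PINYIN, VARIANTS and INDEX, the function names, and the 0.75 threshold are all hypothetical stand-ins chosen for illustration, and the scoring covers only character-by-character comparison with variant look-up (name component misalignment is left out).

```python
# Toy character-to-Pinyin table (hypothetical; a real system would use a full lexicon).
PINYIN = {
    "张": "zhang", "張": "zhang", "章": "zhang",
    "伟": "wei", "偉": "wei", "薇": "wei",
    "李": "li", "娜": "na",
}

# Hypothetical variant table mapping traditional forms to simplified forms.
VARIANTS = {"張": "张", "偉": "伟"}

# Romanized name index; each key keeps the original scripts it was derived from,
# so several different character names can share one romanization.
INDEX = {
    "zhang wei": ["张伟", "章薇"],
    "li na": ["李娜"],
}


def romanize(name):
    """Romanize a name character by character into toneless Pinyin."""
    return " ".join(PINYIN[ch] for ch in name)


def normalize(ch):
    """Map a character to its canonical variant (assumed here: simplified)."""
    return VARIANTS.get(ch, ch)


def native_score(a, b):
    """Fraction of positions whose characters agree after variant look-up."""
    n = max(len(a), len(b))
    if n == 0:
        return 0.0
    hits = sum(1 for x, y in zip(a, b) if normalize(x) == normalize(y))
    return hits / n


def match(name, threshold=0.75):
    """Cross-script match on Pinyin, then filter candidates in native script."""
    candidates = INDEX.get(romanize(name), [])
    scored = [(c, native_score(name, c)) for c in candidates]
    return [(c, s) for c, s in scored if s >= threshold]


# Both stored names romanize to "zhang wei", but only one survives the filter.
print(match("張偉"))
```

In this toy data, 张伟 and 章薇 are indistinguishable after romanization, so the Pinyin match alone would return both. The native script score keeps only the candidate whose characters agree with the query, which is the false-positive filtering the claim describes.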
Address: Armonk NY US