Title of invention Method and system to produce and train composite similarity functions for product normalization
Abstract A method and system to produce and train composite similarity functions for record linkage problems, including product normalization problems, are disclosed. In one embodiment, for a group of products in a plurality of products, a composite similarity function is constructed for the group of products from a weighted set of basis similarity functions. Training records are used to calculate the weights in the weighted set of basis similarity functions in the composite similarity function for the group of products. In another embodiment, a composite similarity function is applied to pairs of training records. The application of the composite similarity function provides a number that can be used to indicate whether two records relate to a common subject. The composite similarity function includes a weighted set of basis similarity functions. A perceptron algorithm is used to modify the weights in the weighted set.
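The idea in the abstract, a composite similarity function built as a weighted set of basis similarity functions whose weights are adjusted by a perceptron, can be illustrated with a minimal sketch. This is not the patented implementation. The two basis functions (`jaccard_name`, `same_brand`), the record fields `name` and `brand`, the bias term, the learning rate, and the toy training pairs are all assumptions chosen for illustration, and a plain (non-averaged) perceptron update is used.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Basis = Callable[[Record, Record], float]
LabeledPair = Tuple[Record, Record, int]  # label is +1 (same product) or -1


def jaccard_name(a: Record, b: Record) -> float:
    """Hypothetical basis function: token overlap of the product names."""
    ta, tb = set(a["name"].lower().split()), set(b["name"].lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def same_brand(a: Record, b: Record) -> float:
    """Hypothetical basis function: 1.0 when the brands match, else 0.0."""
    return 1.0 if a["brand"].lower() == b["brand"].lower() else 0.0


BASIS: List[Basis] = [jaccard_name, same_brand]


def features(a: Record, b: Record) -> List[float]:
    """Evaluate every basis similarity function on one pair of records."""
    return [f(a, b) for f in BASIS]


def composite_similarity(weights: List[float], bias: float,
                         a: Record, b: Record) -> float:
    """Weighted sum of basis similarities; a positive score suggests a match."""
    return sum(w * x for w, x in zip(weights, features(a, b))) + bias


def train_perceptron(pairs: List[LabeledPair], epochs: int = 20,
                     lr: float = 1.0) -> Tuple[List[float], float]:
    """Plain perceptron: nudge the weights whenever a pair is misclassified."""
    weights, bias = [0.0] * len(BASIS), 0.0
    for _ in range(epochs):
        mistakes = 0
        for a, b, label in pairs:
            if label * composite_similarity(weights, bias, a, b) <= 0:
                x = features(a, b)
                weights = [w + lr * label * xi for w, xi in zip(weights, x)]
                bias += lr * label
                mistakes += 1
        if mistakes == 0:  # every training pair is classified correctly
            break
    return weights, bias


# Toy training pairs (invented for this sketch, not taken from the patent).
CANON = {"name": "Canon EOS 20D digital camera", "brand": "Canon"}
NANO = {"name": "Apple iPod nano 4GB", "brand": "Apple"}
TRAINING_PAIRS: List[LabeledPair] = [
    (CANON, {"name": "Canon EOS 20D camera", "brand": "canon"}, +1),
    (CANON, {"name": "Nikon D70 digital camera", "brand": "Nikon"}, -1),
    (NANO, {"name": "iPod nano 4GB Apple", "brand": "Apple"}, +1),
    (NANO, {"name": "Apple iPod shuffle 1GB", "brand": "Apple"}, -1),
]

weights, bias = train_perceptron(TRAINING_PAIRS)
for a, b, label in TRAINING_PAIRS:
    print(label, round(composite_similarity(weights, bias, a, b), 3))
```

The sign of the returned score plays the role of the "number that can be used to indicate whether two records relate to a common subject"; in this sketch a positive value is read as a match. Training a separate weight vector per product group, as the first embodiment describes, would amount to calling `train_perceptron` once per group with that group's training pairs.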
Publication number US7702631(B1) Publication date 2010.04.20
Application number US20060376503 Filing date 2006.03.14
Applicant GOOGLE INC. Inventors BASU SUGATO;BILENKO MIKHAIL;SAHAMI MEHRAN
Classification G06F17/30;G06F7/00 Main classification G06F17/30
Agency Agent
Main claim
Address