Title: PREDICTING WHETHER STRINGS IDENTIFY A SAME SUBJECT
Abstract: The present invention is directed to predicting whether two character strings refer to a same subject. An exemplary embodiment includes using a set of character-string pairs, each of which has been identified as either a match or a nonmatch, to learn a function. The function can then be applied to the two character strings to quantify a likelihood that they refer to the same subject matter. For example, a kernel-based classifier analyzes the set of character-string pairs using a kernel function. Based on the analysis, the classifier can generate parameters. The parameters are usable to define a prediction algorithm that, when applied to the two character strings, generates a prediction value, which suggests whether the two character strings match, i.e., refer to the same subject matter.
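The flow described in the abstract (labeled pairs, a kernel function, learned parameters, a prediction value) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method claimed in the patent: the abstract names neither the kernel nor the classifier, so the sketch swaps in a kernel perceptron and a deliberately simple pair kernel built on character-bigram cosine similarity. All function names, the training pairs, and the labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def bigrams(text):
    """Count the lowercase character bigrams of a string."""
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))


def similarity(a, b):
    """Cosine similarity of bigram counts, in [0, 1] (0.0 if either is empty)."""
    ca, cb = bigrams(a), bigrams(b)
    dot = sum(n * cb[g] for g, n in ca.items())
    norm = sqrt(sum(n * n for n in ca.values()) * sum(n * n for n in cb.values()))
    return dot / norm if norm else 0.0


def pair_kernel(p, q):
    """Assumed kernel between two string pairs: linear kernel on (similarity, 1)."""
    return similarity(*p) * similarity(*q) + 1.0


def train(pairs, labels, max_epochs=1000):
    """Kernel perceptron: returns one parameter (alpha) per training pair."""
    gram = [[pair_kernel(p, q) for q in pairs] for p in pairs]
    alphas = [0] * len(pairs)
    for _ in range(max_epochs):
        mistakes = 0
        for j, y in enumerate(labels):
            score = sum(a * yi * gram[i][j]
                        for i, (a, yi) in enumerate(zip(alphas, labels)))
            if y * score <= 0:  # misclassified (or undecided): strengthen this pair
                alphas[j] += 1
                mistakes += 1
        if mistakes == 0:  # every training pair is on the correct side
            break
    return alphas


def prediction_value(alphas, pairs, labels, a, b):
    """Positive value suggests a match, negative a nonmatch."""
    return sum(al * y * pair_kernel(p, (a, b))
               for al, y, p in zip(alphas, labels, pairs))


# Hypothetical labeled set: +1 = match, -1 = nonmatch.
PAIRS = [
    ("Main Street", "Main St"),
    ("Jon Smith", "John Smith"),
    ("Acme Corp", "Acme Corporation"),
    ("Main Street", "Oak Avenue"),
    ("Jon Smith", "Mary Brown"),
    ("Acme Corp", "Globex Inc"),
]
LABELS = [1, 1, 1, -1, -1, -1]
ALPHAS = train(PAIRS, LABELS)

if __name__ == "__main__":
    for a, b in [("Redmond", "Redmond"), ("alpha", "zzz")]:
        print(a, "|", b, "->", prediction_value(ALPHAS, PAIRS, LABELS, a, b))
```

Here the learned alphas play the role of the "parameters", and `prediction_value` plays the role of the "prediction algorithm". With this toy kernel the classifier can only learn a threshold on bigram similarity; a richer pair kernel would be needed to capture abbreviations or transpositions that share few bigrams.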
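Publication No.: US2010306148(A1) Publication date: 2010.12.02
Application No.: US20090474258 Filing date: 2009.05.28
Applicant: MICROSOFT CORPORATION Inventor: JOHNSTON CAROLYN P.
Classification: G06N5/04;G06F15/18 Main classification: G06N5/04
Agency: Agent:
Independent claim:
Address: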