Title Internal linking co-convergence using clustering with no hierarchy
Abstract Certain implementations of the disclosed technology include systems and methods for linking entities in an internal database by utilizing co-convergence and clustering. The method may include clustering database records into a first set of clusters having corresponding first cluster identifications (IDs). The clustering may be based at least in part on determining similarity among corresponding field values. The method may include associating mutually matching database records by performing at least one matching iteration for each of the database records. The method may further include determining similarity among corresponding field values of the database records and re-clustering at least a portion of the database records into a second set of clusters, the re-clustering being based at least in part on the associating of mutually matching database records and on the determining of similarity among corresponding field values of the database records.
Publication No. US9043359(B2) Publication date 2015.05.26
Application No. US201314029698 Filing date 2013.09.17
Applicant LEXISNEXIS RISK SOLUTIONS FL INC. Inventor Bayliss David Alan
Classification G06F17/30 Main classification G06F17/30
Agency Troutman Sanders LLP Agent Troutman Sanders LLP; Schneider Ryan A.; Jones Mark Lehi
Principal claim 1. A computer-implemented method comprising: clustering database records into a first set of clusters having corresponding first cluster identifications (IDs), each database record comprising one or more field values, wherein the clustering is based at least in part on determining similarity among corresponding field values of the database records; associating mutually matching database records, wherein the associating comprises performing at least one matching iteration for each of the database records, wherein the matching iteration is based at least in part on the first cluster IDs; determining similarity among corresponding field values of the database records; re-clustering at least a portion of the database records into a second set of clusters having corresponding second cluster IDs, the re-clustering based at least in part on the associating mutually matching database records and on the determining similarity among corresponding field values of the database records; and outputting database record information, based at least in part on the re-clustering.
Address Boca Raton FL US
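Illustrative sketch The steps recited in the principal claim (first clustering, a matching iteration per record based on first cluster IDs, association of mutual matches, re-clustering) can be loosely illustrated in Python. This is not the patented implementation: the similarity measure, the thresholds, the single-link merging, and every function and variable name here (`similarity`, `cluster`, `best_match`, `recluster`) are assumptions chosen only to make the flow concrete.

```python
from itertools import combinations

def similarity(a, b):
    """Assumed measure: fraction of fields populated in both records that are equal."""
    shared = [k for k in a if a[k] and b.get(k)]
    if not shared:
        return 0.0
    return sum(a[k] == b[k] for k in shared) / len(shared)

def cluster(records, threshold):
    """First clustering: single-link grouping via union-find; returns one ID per record."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if similarity(records[i], records[j]) >= threshold:
            parent[find(j)] = find(i)
    return [find(i) for i in range(len(records))]

def best_match(idx, records, ids):
    """Matching iteration for one record: best-scoring cluster other than its own."""
    best_id, best_score = None, 0.0
    for j, other in enumerate(records):
        if ids[j] == ids[idx]:
            continue
        score = similarity(records[idx], other)
        if score > best_score:
            best_id, best_score = ids[j], score
    return best_id, best_score

def recluster(records, first_ids, threshold):
    """Merge clusters that propose each other (mutual match); returns second IDs."""
    proposals = {}  # first cluster ID -> set of cluster IDs it proposes to join
    for i in range(len(records)):
        target, score = best_match(i, records, first_ids)
        if target is not None and score >= threshold:
            proposals.setdefault(first_ids[i], set()).add(target)

    second = {cid: cid for cid in set(first_ids)}

    def root(c):
        while second[c] != c:
            c = second[c]
        return c

    for a, targets in proposals.items():
        for b in targets:
            if a in proposals.get(b, ()):  # association only when the match is mutual
                ra, rb = root(a), root(b)
                if ra != rb:
                    second[max(ra, rb)] = min(ra, rb)
    return [root(c) for c in first_ids]

# Hypothetical records; field names and values are invented for illustration.
records = [
    {"name": "Ann Lee", "dob": "1980-01-01", "ssn": "111"},
    {"name": "Ann Lee", "dob": "1980-01-01", "ssn": "111"},
    {"name": "A. Lee", "dob": "1980-01-01", "ssn": "111"},
    {"name": "Bob Ray", "dob": "1975-05-05", "ssn": "222"},
]
first_ids = cluster(records, threshold=1.0)               # strict first pass
second_ids = recluster(records, first_ids, threshold=0.6)  # looser mutual-match pass
```

In this sketch the strict first pass groups only the two identical records; the third record joins them in the second pass because it and the first cluster each select the other as their best match, while the unrelated fourth record keeps its own cluster ID.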