Title Semantic filtering in data matching
Abstract A computer-implemented method for finding related datasets includes, for each reference dataset from multiple reference datasets: determining domains and geographies for a user dataset and the reference dataset; obtaining a weighted domain coefficient and a weighted geography coefficient using the determined domains and geographies for the user dataset and the reference dataset; calculating a correlation coefficient between the user dataset and the reference dataset; and calculating a semantic filtering coefficient for the user dataset and the reference dataset using the calculated correlation coefficient, the weighted domain coefficient and the weighted geography coefficient.
Publication No. US9563664(B2) Publication Date 2017.02.07
Application No. US201414581246 Filing Date 2014.12.23
Applicant Business Objects Software, Ltd. Inventor Potiagalov Alexei
Classification G06F17/30 Main Classification G06F17/30
Agency Brake Hughes Bellermann LLP Agent Brake Hughes Bellermann LLP
Main Claim 1. A system for finding related datasets, the system comprising: at least one processor; a non-transitory computer-readable medium configured to store executable instructions that when executed by the at least one processor are configured to implement: a user dataset; one or more reference datasets; a domain weight module containing a list of domains having weighted domain coefficients; a geography weight module containing a list of geographies having weighted geography coefficients; and a semantic filtering module that, for each reference dataset, is configured to: determine domains and geographies for the user dataset and the reference dataset; obtain a weighted domain coefficient from the domain weight module and a weighted geography coefficient from the geography weight module using the determined domains and geographies for the user dataset and the reference dataset; calculate a correlation coefficient between the user dataset and the reference dataset; and calculate a semantic filtering coefficient for the user dataset and the reference dataset using the calculated correlation coefficient, the weighted domain coefficient and the weighted geography coefficient.
Address Dublin IE
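
The four steps recited in the main claim can be illustrated with a minimal Python sketch. This is not the patented implementation: the record does not say how domains and geographies are determined, which correlation measure is used, or how the three coefficients are combined. The sketch assumes that the domain and geography are read from dataset metadata, that the correlation coefficient is a Pearson correlation, and that the semantic filtering coefficient is the product of the three values. The weight tables, the default weight, and all function and field names are hypothetical.

```python
from math import sqrt

# Hypothetical weight tables standing in for the claim's "domain weight module"
# and "geography weight module". Keys are (user value, reference value) pairs.
DOMAIN_WEIGHTS = {("sales", "sales"): 1.0, ("sales", "weather"): 0.5}
GEOGRAPHY_WEIGHTS = {("US", "US"): 1.0, ("US", "CA"): 0.8}
DEFAULT_WEIGHT = 0.1  # assumed fallback for pairs missing from a table


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def semantic_filtering_coefficient(user, reference):
    """Combine correlation with domain and geography weights.

    Assumption: the combination is a plain product; the claim only says the
    coefficient is calculated "using" the three values.
    """
    # Step 1: determine domains and geographies (here: read from metadata).
    domain_pair = (user["domain"], reference["domain"])
    geography_pair = (user["geography"], reference["geography"])
    # Step 2: obtain the weighted coefficients from the weight tables.
    domain_weight = DOMAIN_WEIGHTS.get(domain_pair, DEFAULT_WEIGHT)
    geography_weight = GEOGRAPHY_WEIGHTS.get(geography_pair, DEFAULT_WEIGHT)
    # Step 3: calculate the correlation coefficient.
    correlation = pearson(user["values"], reference["values"])
    # Step 4: calculate the semantic filtering coefficient.
    return correlation * domain_weight * geography_weight


def rank_reference_datasets(user, references):
    """Score every reference dataset and return (name, score), best first."""
    scored = [(ref["name"], semantic_filtering_coefficient(user, ref))
              for ref in references]
    return sorted(scored, key=lambda item: item[1], reverse=True)


user_dataset = {"name": "my_sales", "domain": "sales", "geography": "US",
                "values": [1.0, 2.0, 3.0, 4.0]}
reference_datasets = [
    {"name": "us_sales", "domain": "sales", "geography": "US",
     "values": [2.0, 4.0, 6.0, 8.0]},
    {"name": "ca_weather", "domain": "weather", "geography": "CA",
     "values": [2.0, 4.0, 6.0, 8.0]},
]
ranking = rank_reference_datasets(user_dataset, reference_datasets)
```

In this sketch both reference datasets correlate perfectly with the user dataset, so the ranking is decided by the weights alone: the same-domain, same-geography dataset scores higher than the cross-domain, cross-geography one. That is the filtering effect the abstract describes, under the product assumption stated above.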