Title of invention |
Semantic filtering in data matching |
Abstract |
A computer-implemented method for finding related datasets includes, for each reference dataset from multiple reference datasets, determining domains and geographies for a user dataset and the reference dataset, obtaining a weighted domain coefficient and a weighted geography coefficient using the determined domains and geographies for the user dataset and the reference dataset, calculating a correlation coefficient between the user dataset and the reference dataset, and calculating a semantic filtering coefficient for the user dataset and the reference dataset using the calculated correlation coefficient, the weighted domain coefficient, and the weighted geography coefficient. |
Publication number |
US9563664(B2) |
Publication date |
2017.02.07 |
Application number |
US201414581246 |
Filing date |
2014.12.23 |
Applicant |
Business Objects Software, Ltd. |
Inventor |
Potiagalov Alexei |
Classification |
G06F17/30 |
Main classification |
G06F17/30 |
Agency |
Brake Hughes Bellermann LLP |
Agent |
Brake Hughes Bellermann LLP |
Main claim |
1. A system for finding related datasets, the system comprising:
at least one processor; a non-transitory computer-readable medium configured to store executable instructions that when executed by the at least one processor are configured to implement: a user dataset; one or more reference datasets; a domain weight module containing a list of domains having weighted domain coefficients; a geography weight module containing a list of geographies having weighted geography coefficients; and a semantic filtering module that, for each reference dataset, is configured to:
determine domains and geographies for the user dataset and the reference dataset;
obtain a weighted domain coefficient from the domain weight module and a weighted geography coefficient from the geography weight module using the determined domains and geographies for the user dataset and the reference dataset;
calculate a correlation coefficient between the user dataset and the reference dataset; and
calculate a semantic filtering coefficient for the user dataset and the reference dataset using the calculated correlation coefficient, the weighted domain coefficient and the weighted geography coefficient. |
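The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and several points are assumptions because the record does not specify them: the weight tables and their values are hypothetical, weights are looked up by a (user, reference) pair, the domain and geography of each dataset are taken from metadata fields rather than inferred, the correlation is Pearson's, and the three coefficients are combined by simple multiplication.

```python
from math import sqrt

# Hypothetical weight tables (assumption): the claim only states that the
# modules hold lists of domains/geographies with weighted coefficients.
DOMAIN_WEIGHTS = {("sales", "sales"): 1.0, ("sales", "weather"): 0.5}
GEOGRAPHY_WEIGHTS = {("US", "US"): 1.0, ("US", "CA"): 0.8}
DEFAULT_WEIGHT = 0.1  # assumed fallback for pairs missing from a table


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / sqrt(var_x * var_y)


def semantic_filtering_coefficient(user, reference):
    """Combine correlation with domain and geography weights.

    The product used here is an assumed combination rule, not one
    stated in the source.
    """
    w_domain = DOMAIN_WEIGHTS.get(
        (user["domain"], reference["domain"]), DEFAULT_WEIGHT)
    w_geo = GEOGRAPHY_WEIGHTS.get(
        (user["geography"], reference["geography"]), DEFAULT_WEIGHT)
    r = pearson(user["values"], reference["values"])
    return r * w_domain * w_geo


def rank_references(user, references):
    """Score every reference dataset and sort by descending coefficient."""
    scored = [(ref["name"], semantic_filtering_coefficient(user, ref))
              for ref in references]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Under these assumptions, a reference dataset that correlates perfectly with the user dataset but belongs to a different domain and geography is ranked below an equally correlated dataset from the same domain and geography, which is the filtering effect the claim describes.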
Address |
Dublin IE |