Title of invention: Data matching for column-oriented data tables
Abstract: A computer-implemented method includes receiving a column-oriented table comprising data for a column family, wherein the data for the column family comprises column names and corresponding column values, receiving a set of anonymous column names for the column family, receiving a set of synonymous column names for the column family, determining a weighting for each column name that is not an anonymous column name based on the count or frequency of occurrence of the column name and the synonymous column names within the column-oriented table, and processing the column-oriented table with a probabilistic matching engine using the weighting for each column name. A corresponding computer program product and computer system are also disclosed herein.
Publication number: US9529830(B1) Publication date: 2016.12.27
Application number: US201615008764 Filing date: 2016.01.28
Applicant: International Business Machines Corporation Inventors: Eshwar Bhavani K.; Naganna Soma Shekar; Ramakrishnan Umasuthan; Yellareddy Shashidhar R.
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent: McDaniel Steven F.
Main claim: 1. A method, executed by one or more processors, the method comprising: receiving a column-oriented table comprising data for a column family, wherein the data for the column family comprises column names and corresponding column values; receiving a set of anonymous column names for the column family; receiving a set of synonymous column names for the column family; determining a weighting for each column name that is not an anonymous column name based on the count or frequency of occurrence of the column name and the synonymous column names within the column-oriented table; and processing the column-oriented table with a probabilistic matching engine using the weighting for each column name.
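The weighting step recited in the claim can be illustrated with a minimal sketch. The record does not give a formula, so everything here is an assumption made for illustration: the function name `column_name_weights`, the representation of the table as a list of dicts (one per row of the column family), the pooling of counts across each synonym group, and the use of relative frequency as the weight. The probabilistic matching engine itself is not specified in the record and is not modeled.

```python
from collections import Counter


def column_name_weights(rows, anonymous_names, synonym_groups):
    """Return a weight for every non-anonymous column name seen in `rows`.

    rows            -- iterable of dicts mapping column name -> column value
                       (hypothetical stand-in for one column family)
    anonymous_names -- set of column names to exclude from weighting
    synonym_groups  -- iterable of sets; names in one set are counted together

    Assumption: weight = pooled occurrence count of the name's synonym group
    divided by the total number of non-anonymous column occurrences.
    """
    # Map each synonym to one representative name for its group.
    canonical = {}
    for group in synonym_groups:
        representative = min(group)
        for name in group:
            canonical[name] = representative

    pooled = Counter()  # occurrences per synonym group
    seen = set()        # non-anonymous names that actually occur
    for row in rows:
        for name in row:
            if name in anonymous_names:
                continue
            seen.add(name)
            pooled[canonical.get(name, name)] += 1

    total = sum(pooled.values())
    # `seen` is empty whenever `total` is zero, so no division by zero occurs.
    return {n: pooled[canonical.get(n, n)] / total for n in sorted(seen)}


# Hypothetical sample data: "col_17" and "col_42" play the anonymous names.
rows = [
    {"phone": "555-0100", "email": "a@example.com", "col_17": "x"},
    {"telephone": "555-0101", "email": "b@example.com"},
    {"phone": "555-0102", "col_42": "y"},
]
weights = column_name_weights(
    rows,
    anonymous_names={"col_17", "col_42"},
    synonym_groups=[{"phone", "telephone"}],
)
print(weights)
```

In this sketch "phone" and "telephone" share one pooled count, so both receive the same weight, while the anonymous names receive none. A downstream matcher could scale each column's contribution to a match score by these weights, but how the claimed engine uses them is not stated in the record.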
Address: Armonk NY US