Title of Invention |
ESTIMATING MOST FREQUENT VALUES FOR A DATA SET |
Abstract |
Provided are techniques for estimating most frequent values. A sample of values made up of rows is received from each of multiple nodes. The samples of values from the multiple nodes are aggregated to generate a sample table storing the rows. A descending list of most frequent values and associated frequencies is obtained using the sample table. Most frequent values whose associated frequencies are below a minimum absolute frequency are pruned from the descending list. The remaining most frequent values are extrapolated to reflect a data set. |
Application Publication Number |
US2016154805(A1) |
Application Publication Date |
2016.06.02 |
Application Number |
US201615017442 |
Filing Date |
2016.02.05 |
Applicant |
International Business Machines Corporation |
Inventors |
Finnerty James L.; Gopal Venkatesh S.; Tammisetti Venkannababu; To Paul-John A. |
Classification |
G06F17/30 |
Main Classification |
G06F17/30 |
Agency |
|
Agent |
|
Independent Claim |
1. A method, comprising:
receiving, using a processor of a computer, from each of multiple nodes, a sample of values made up of rows; aggregating the sample of values from each of the multiple nodes to generate a sample table storing the rows; using the sample table to obtain a descending list of most frequent values and associated frequencies; pruning most frequent values from the descending list whose associated frequencies are below a minimum absolute frequency; and extrapolating remaining most frequent values to reflect a data set. |
Address |
Armonk NY US |
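
The steps recited in the abstract and claim 1 (aggregate per-node samples, build a descending frequency list, prune below a minimum absolute frequency, extrapolate) can be illustrated with a minimal Python sketch. This is not the patented implementation. The function name `estimate_most_frequent_values`, the parameters `total_rows`, `min_abs_frequency`, and `top_k`, and in particular the extrapolation rule (scaling each sample count by the ratio of data set size to sample size, which assumes uniform random sampling) are all assumptions made for illustration; the record above does not specify how extrapolation is performed.

```python
from collections import Counter
from typing import Hashable, Iterable


def estimate_most_frequent_values(
    node_samples: Iterable[Iterable[Hashable]],
    total_rows: int,
    min_abs_frequency: int = 2,
    top_k: int = 10,
) -> list[tuple[Hashable, int]]:
    """Hypothetical sketch of the claimed steps; names and defaults are assumptions."""
    # Step 1: aggregate the sample from every node into one "sample table".
    sample_table = [value for sample in node_samples for value in sample]
    if not sample_table:
        return []

    # Step 2: descending list of the most frequent values with their sample counts.
    descending = Counter(sample_table).most_common(top_k)

    # Step 3: prune values whose sample count is below the minimum absolute frequency.
    kept = [(value, count) for value, count in descending if count >= min_abs_frequency]

    # Step 4 (assumed rule): scale sample counts up to the full data set size,
    # assuming the samples were drawn uniformly at random.
    scale = total_rows / len(sample_table)
    return [(value, round(count * scale)) for value, count in kept]


# Three nodes each contribute a small sample; the full data set has 800 rows.
node_samples = [["a", "a", "b"], ["a", "c", "b"], ["a", "d"]]
print(estimate_most_frequent_values(node_samples, total_rows=800))
# [('a', 400), ('b', 200)]
```

In this example the values `c` and `d` each appear only once in the 8-row sample table, so they fall below the assumed threshold of 2 and are pruned before extrapolation. Pruning on the absolute sample count, rather than on the extrapolated count, keeps values that were seen too rarely to estimate reliably out of the final list.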