Title: ESTIMATING MOST FREQUENT VALUES FOR A DATA SET
Abstract: Provided are techniques for estimating most frequent values. A sample of values made up of rows is received from each of multiple nodes. The samples from the multiple nodes are aggregated to generate a sample table storing the rows. A descending list of most frequent values and their associated frequencies is obtained using the sample table. Most frequent values whose associated frequencies are below a minimum absolute frequency are pruned from the descending list. The remaining most frequent values are extrapolated to reflect a data set.
Publication number: US2016154805(A1)
Publication date: 2016.06.02
Application number: US201615017442
Filing date: 2016.02.05
Applicant: International Business Machines Corporation
Inventors: Finnerty James L.; Gopal Venkatesh S.; Tammisetti Venkannababu; To Paul-John A.
Classification: G06F17/30
Main classification: G06F17/30
Agency:
Agent:
Main claim: 1. A method, comprising: receiving, using a processor of a computer, from each of multiple nodes, a sample of values made up of rows; aggregating the sample of values from each of the multiple nodes to generate a sample table storing the rows; using the sample table to obtain a descending list of most frequent values and associated frequencies; pruning most frequent values from the descending list whose associated frequencies are below a minimum absolute frequency; and extrapolating remaining most frequent values to reflect a data set.
Address: Armonk NY US
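
The steps recited in the main claim (aggregate per-node samples, rank values by frequency, prune below a minimum absolute frequency, extrapolate) can be illustrated with a minimal sketch. This is not the patented implementation. The function name `estimate_most_frequent_values`, the parameters `total_rows`, `top_k`, and `min_abs_freq`, and the linear scaling used for extrapolation are all assumptions made for illustration; the record does not say how extrapolation is performed, and each row is treated here as a single hashable value.

```python
from collections import Counter


def estimate_most_frequent_values(node_samples, total_rows, top_k=10, min_abs_freq=2):
    """Hypothetical sketch: estimate most frequent values from per-node samples.

    node_samples: list of lists, one sampled list of values per node.
    total_rows:   assumed row count of the full data set.
    """
    # Aggregate the samples from all nodes into a single sample table.
    sample_table = [row for sample in node_samples for row in sample]
    if not sample_table:
        return []

    # Obtain a descending list of (value, sample frequency) pairs.
    ranked = Counter(sample_table).most_common(top_k)

    # Prune values whose sample frequency is below the minimum absolute frequency.
    kept = [(value, count) for value, count in ranked if count >= min_abs_freq]

    # Extrapolate to the full data set. Linear scaling is an assumption here;
    # it presumes the sample was drawn uniformly at random.
    scale = total_rows / len(sample_table)
    return [(value, round(count * scale)) for value, count in kept]


if __name__ == "__main__":
    samples = [["a", "a", "b"], ["a", "c", "b"], ["a", "d"]]
    print(estimate_most_frequent_values(samples, total_rows=800))
    # prints [('a', 400), ('b', 200)]
```

In this sketch the pruning threshold applies to counts in the sample rather than to extrapolated counts, so values seen only once (`c` and `d` above) are dropped before scaling. The point of such a threshold is that a value seen once in a small sample gives little evidence of being frequent in the full data set.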