Title of Invention |
Estimating a number of unique values in a list |
Abstract |
A method determines a number of unique values in a sample of a list of values and estimates a number of the unique values for an unsampled portion of the list of values. The method estimates a number of the unique values in the list by adding the number of unique values in the sample to the number of the unique values in the unsampled portion. |
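The additive structure described in the abstract can be sketched in Python. This is only an illustration under stated assumptions: the function names `sample_profile` and `estimate_total_unique`, and the `estimate_unsampled` callback, are hypothetical and do not come from the patent. The callback stands in for whatever estimator is used for the unsampled portion.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, Sequence, Tuple


def sample_profile(sample: Sequence[Hashable]) -> Tuple[int, Dict[int, int]]:
    """Return the number of unique values in the sample and its
    frequency-of-frequencies table (k -> how many values occur exactly k times)."""
    counts = Counter(sample)                  # value -> occurrences in the sample
    freq_of_freq = Counter(counts.values())   # k -> number of values seen k times
    return len(counts), dict(freq_of_freq)


def estimate_total_unique(
    sample: Sequence[Hashable],
    list_size: int,
    estimate_unsampled: Callable[[Dict[int, int], int, int], float],
) -> float:
    """Add the exact unique count of the sample to an estimate for the rest.

    `estimate_unsampled` is a hypothetical plug-in: it receives the
    frequency-of-frequencies table, the sample size, and the size of the
    unsampled portion, and returns an estimated count of additional
    unique values.
    """
    d_sample, freq_of_freq = sample_profile(sample)
    unsampled_size = list_size - len(sample)
    d_unsampled = estimate_unsampled(freq_of_freq, len(sample), unsampled_size)
    return d_sample + d_unsampled
```

Keeping the estimator as a parameter separates the part that is counted exactly (the sample) from the part that depends on a statistical model (the unsampled portion).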
Publication Number |
US9158815(B2) |
Publication Date |
2015.10.13 |
Application Number |
US201012907325 |
Filing Date |
2010.10.19 |
Applicant |
Hewlett-Packard Development Company, L.P. |
Inventors |
Lakshminarayan Choudur;Hill Joe Robert |
Classification Numbers |
G06F7/00;G06F17/30;G06F17/18 |
Main Classification Number |
G06F7/00 |
Patent Agency |
|
Agent |
Dryja Michael A. |
Independent Claim |
1. A method executed by a computer system, comprising:
storing, in the computer system, a list of values;
determining, with the computer system, a number of unique values and a frequency of the unique values for a sample of the list of values;
estimating, with the computer system, a number of unique values and a frequency of the unique values for an unsampled portion of the list of values, by adding a value based on a distribution family for the frequency of the unique values for the sample to a value based on a conditional probability distribution family for the frequency of the unique values in the unsampled portion given the frequency of the unique values for the sample; and
estimating, with the computer system, a number of unique values in the list by adding the determined number of unique values in the sample to the estimated number of unique values in the unsampled portion. |
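The claim does not name the distribution families it relies on, so the sketch below does not implement the claimed estimator. As a stand-in, it uses the classical Good-Toulmin estimator, which predicts how many new distinct values would appear in a further draw of size t times the sample size, using only the sample's frequency-of-frequencies table. The function names are hypothetical, and the estimator assumes a random sample and an unsampled portion no larger than the sample (t ≤ 1); beyond that range the alternating series becomes unstable.

```python
from collections import Counter
from typing import Hashable, Mapping, Sequence


def good_toulmin_unseen(
    freq_of_freq: Mapping[int, int], sample_size: int, unsampled_size: int
) -> float:
    """Good-Toulmin estimate of new unique values in the unsampled portion.

    Computes sum over k >= 1 of (-1)^(k+1) * t^k * f_k, where f_k is the
    number of values seen exactly k times in the sample and
    t = unsampled_size / sample_size. This is a substitute technique,
    not the estimator recited in the claim.
    """
    if sample_size <= 0:
        raise ValueError("sample must be non-empty")
    t = unsampled_size / sample_size
    if t > 1:
        raise ValueError("Good-Toulmin is only reliable for t <= 1")
    estimate = sum(
        (-1) ** (k + 1) * t ** k * f_k for k, f_k in freq_of_freq.items()
    )
    return max(estimate, 0.0)  # a count cannot be negative


def estimate_unique(sample: Sequence[Hashable], list_size: int) -> float:
    """Unique values in the sample plus the estimate for the unsampled rest."""
    counts = Counter(sample)                  # value -> occurrences in the sample
    freq_of_freq = Counter(counts.values())   # k -> number of values seen k times
    n = len(sample)
    unseen = good_toulmin_unseen(freq_of_freq, n, list_size - n)
    return len(counts) + unseen
```

For example, a sample `["a", "b", "a", "c"]` drawn from a list of 8 values has 3 unique values, two seen once and one seen twice; with t = 1 the stand-in estimator adds 2 - 1 = 1, giving an estimated total of 4 unique values.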
Address |
Houston TX US |