Title of invention: Estimating a number of unique values in a list
Abstract: A method determines a number of unique values in a sample of a list of values and estimates a number of the unique values for an unsampled portion of the list of values. The method estimates a number of the unique values in the list by adding the number of unique values in the sample to the number of the unique values in the unsampled portion.
Publication number: US9158815(B2)
Publication date: 2015.10.13
Application number: US201012907325
Filing date: 2010.10.19
Applicant: Hewlett-Packard Development Company, L.P.
Inventors: Lakshminarayan Choudur; Hill Joe Robert
Classification: G06F7/00; G06F17/30; G06F17/18
Main classification: G06F7/00
Agency:
Agent: Dryja Michael A.
Main claim: 1. A method executed by a computer system, comprising: storing, in the computer system, a list of values; determining, with the computer system, a number of unique values and a frequency of the unique values for a sample of the list of values; estimating, with the computer system, a number of unique values and a frequency of the unique values for an unsampled portion of the list of values, by adding a value based on a distribution family for the frequency of the unique values for the sample to a value based on a conditional probability distribution family for the frequency of the unique values in the unsampled portion given the frequency of the unique values for the sample; and estimating, with the computer system, a number of unique values in the list by adding the determined number of unique values in the sample to the estimated number of unique values in the unsampled portion.
Address: Houston TX US
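
The overall shape of the claimed method (count the unique values in a sample, estimate the unique values in the unsampled portion, then add the two) can be illustrated with a minimal Python sketch. This record does not give the distribution family or the conditional probability distribution family named in the claim, so the sketch swaps in the bias-corrected Chao1 term as a stand-in for the unsampled-portion estimate; it is not the patented estimator. The function name `estimate_unique_from_sample` and its parameters are hypothetical, introduced only for illustration.

```python
from collections import Counter


def estimate_unique_from_sample(sample, total_size):
    """Estimate the number of unique values in a list of `total_size` values,
    given only `sample`, a subset of that list.

    Returns (unique_in_sample, estimated_unseen, estimated_total).
    """
    if total_size < len(sample):
        raise ValueError("total_size must be at least len(sample)")

    # Step 1: number of unique values and their frequencies in the sample.
    freq = Counter(sample)
    unique_in_sample = len(freq)

    # Frequency of frequencies: how many values occur exactly k times.
    freq_of_freq = Counter(freq.values())
    f1 = freq_of_freq.get(1, 0)  # values seen exactly once
    f2 = freq_of_freq.get(2, 0)  # values seen exactly twice

    # Step 2 (ASSUMPTION): bias-corrected Chao1 term used as a stand-in for
    # the estimate of unique values that appear only in the unsampled portion.
    estimated_unseen = f1 * (f1 - 1) / (2 * (f2 + 1))

    # The unsampled portion cannot hold more new values than it has entries.
    estimated_unseen = min(estimated_unseen, total_size - len(sample))

    # Step 3: add the sample count to the unsampled-portion estimate.
    return unique_in_sample, estimated_unseen, unique_in_sample + estimated_unseen


if __name__ == "__main__":
    sample = ["a", "a", "b", "c", "c", "d"]
    print(estimate_unique_from_sample(sample, total_size=20))
```

Capping the unseen estimate at the size of the unsampled portion is a design choice of this sketch: it keeps the result consistent with a finite list, which the plain Chao1 term does not guarantee.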