Title of invention: Distributed reservoir sampling for web applications
Abstract: Random samples without replacement are extracted from a distributed set of items by leveraging techniques for aggregating sampled subsets of the distributed set. This provides a uniform random sample without replacement representative of the distributed set, allowing statistical information to be gleaned from extremely large sets of distributed information. Subset random samples without replacement are extracted from independent subsets of the distributed set of items. The subset random samples are then aggregated to provide a uniform random sample without replacement of a fixed size that is representative of a distributed set of items of unknown size. In one instance, a multivariate hyper-geometric distribution is sampled by breaking up the multivariate hyper-geometric distribution into a set of univariate hyper-geometric distributions. Individual items of a uniform random sample without replacement are then determined utilizing a normal approximation of the univariate hyper-geometric distributions and a finite population correction factor.
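The aggregation step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented procedure, whose details the abstract does not give. The function names (`reservoir_sample`, `approx_hypergeometric`, `merge_samples`), the use of the classic single-pass reservoir algorithm for each subset, and the rounding and clamping of the normal draw are all assumptions made for illustration. The sketch splits the multivariate hypergeometric draw into one univariate draw per subset, and approximates each draw with a normal distribution whose variance carries the finite population correction factor (N - n) / (N - 1).

```python
import math
import random


def reservoir_sample(stream, k, rng):
    """Single-pass uniform sample of up to k items; also returns the item count."""
    sample = []
    seen = 0
    for item in stream:
        seen += 1
        if len(sample) < k:
            sample.append(item)
        else:
            j = rng.randrange(seen)  # uniform index in [0, seen)
            if j < k:
                sample[j] = item
    return sample, seen


def approx_hypergeometric(total, successes, draws, rng):
    """Approximate a univariate hypergeometric draw with a rounded normal draw.

    Assumption: the normal draw is rounded and clamped to the valid support,
    which is only a reasonable approximation when the counts are large.
    """
    lo = max(0, draws - (total - successes))
    hi = min(draws, successes)
    if lo == hi:
        return lo  # outcome is forced, nothing to sample
    p = successes / total
    mean = draws * p
    fpc = (total - draws) / (total - 1)  # finite population correction factor
    std = math.sqrt(draws * p * (1 - p) * fpc)
    return min(hi, max(lo, round(rng.gauss(mean, std))))


def merge_samples(subsets, k, rng):
    """Merge per-subset samples into one sample of size min(k, total items).

    subsets: list of (sample, subset_size) pairs, each sample drawn with the
    same k, so it holds min(k, subset_size) items.
    """
    remaining_total = sum(size for _, size in subsets)
    remaining_draws = min(k, remaining_total)
    merged = []
    for sample, size in subsets:
        if remaining_draws == 0:
            break
        # One univariate draw per subset, conditioned on the earlier draws.
        count = approx_hypergeometric(remaining_total, size, remaining_draws, rng)
        merged.extend(rng.sample(sample, count))
        remaining_total -= size
        remaining_draws -= count
    return merged
```

In this sketch each node would call `reservoir_sample` on its own subset and report the pair `(sample, seen)`. A coordinator would then call `merge_samples` on the collected pairs. Because the last subset's count is forced by the clamping bounds, the merged sample always has exactly `min(k, total)` items.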
Publication number: US2007050357(A1) Publication date: 2007.03.01
Application number: US20050212301 Filing date: 2005.08.26
Applicant: MICROSOFT CORPORATION Inventors: CHICKERING DAVID M.; ROY ASHIS K.; MEEK CHRISTOPHER A.
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address: