发明名称 Single pass space efficient system and method for generating an approximate quantile in a data set having an unknown size
摘要 A space-efficient system and method for generating an approximate phi-quantile data element of a data set in a single pass over the data set, without a priori knowledge of the size of the data set. The approximate phi-quantile is guaranteed to lie within a user-specified approximation error epsi of the true quantile being sought with a probability of at least 1-delta, with delta being a user-defined probability of failure. B buffers, each having a capacity of k elements, initially are filled with elements from the data set, with the values of b and k depending on approximation error e and the probability delta. The buffers are then collapsed into an output buffer, with the remaining buffers then being refilled with elements, collapsed (along with the previous output buffer), and so on until the entire data set has been processed and a single output remains. The element of the output corresponding to the approximate quantile is then output as the approximate quantile. In later iterations (when the height of the tree is at least equal to a predetermined height that depends on delta and epsi), the data is sampled non-uniformly to populate the buffers to render the desired performance. Parallel processors can be used, with the final output buffers of the processors being sent to a collecting processor P0 as input buffers to the collecting processor P0.
申请公布号 US6343288(B1) 申请公布日期 2002.01.29
申请号 US19990268089 申请日期 1999.03.12
申请人 INTERNATIONAL BUSINESS MACHINES CORPORATION 发明人 LINDSAY BRUCE GILBERT;MANKU GURMEET SINGH;RAJAGOPALAN SRIDHAR
分类号 G06F17/30;(IPC1-7):G06F17/30 主分类号 G06F17/30
代理机构 代理人
主权项
地址