摘要 |
A probabilistic data structure is generated for efficient query processing using a histogram for unsorted data in a column of a columnar database. A bucket range size is determined for multiples buckets of a histogram of a column in a columnar database table. In at least some embodiments, the histogram may be a height-balanced histogram. A probabilistic data structure is generated to indicate for which particular buckets in the histogram there is a data value stored in the data block. When an indication of a query directed to the column for select data is received, the probabilistic data structure for each of the data blocks storing data for the column may be examined to determine particular ones of the data blocks which do not need to be read in order to service the query for the select data. |