摘要 |
A method and apparatus for approximating a database statistic, such as the number of distinct values (NDV) is provided. To approximate the NDV for a portion of a table, a synopsis of distinct values is constructed. Each value in the portion is mapped to a domain of values. The mapping function is implemented with a uniform hash function, in one embodiment. If the resultant domain value does not exist in the synopsis, the domain value is added to the synopsis. If the synopsis reaches its capacity, a portion of the domain values are discarded from the synopsis. The statistic is approximated based on the number (N) of domain values in the synopsis and the portion of the domain that is represented in the synopsis relative to the size of the domain.
|