Abstract |
A distinct sampling system and a method of distinct sampling are provided for use with a database that accommodates distinct value queries having predicates. In one embodiment, the distinct sampling system includes a scanning subsystem that is configured to scan each row in the database for a distinct target attribute, employ a hash function to map the distinct target attribute to an attribute priority level, maintain random samples of each row based on a sample priority level and a sample size, and produce a distinct sample therefrom. The distinct sampling system further includes a distinct query estimator that is configured to receive the distinct value queries, cause the distinct value queries to be executed on the distinct sample to retrieve a result, and adjust the result to produce a distinct estimate therefrom.
|
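
The scanning subsystem and the distinct query estimator described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `priority_level`, `DistinctSample`, `max_values`, and `rows_per_value` are hypothetical, as are the choice of SHA-256 with a trailing-zero-bit count as the hash function and the scaling of the result by 2 raised to the sample priority level as the adjustment step; the abstract does not specify any of these.

```python
import hashlib
import random


def priority_level(value, max_level=32):
    """Map an attribute value to a level; level >= i occurs with probability 2**-i.

    Assumption: the level is the number of trailing zero bits of a SHA-256 hash.
    """
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    bits = int.from_bytes(digest[:8], "big")
    level = 0
    while level < max_level and bits & 1 == 0:
        bits >>= 1
        level += 1
    return level


class DistinctSample:
    """Hypothetical distinct sample bounded by a number of distinct values."""

    def __init__(self, max_values=64, rows_per_value=4, seed=0):
        self.max_values = max_values          # sample size bound (distinct values)
        self.rows_per_value = rows_per_value  # rows retained per distinct value
        self.level = 0                        # current sample priority level
        self.entries = {}                     # value -> [rows_seen, retained_rows]
        self._rng = random.Random(seed)

    def add(self, row, attribute):
        """Scan one row, keeping it only if its value's level is high enough."""
        value = row[attribute]
        if priority_level(value) < self.level:
            return
        entry = self.entries.setdefault(value, [0, []])
        entry[0] += 1
        rows = entry[1]
        if len(rows) < self.rows_per_value:
            rows.append(row)
        else:
            # Reservoir sampling keeps a uniform sample of this value's rows.
            j = self._rng.randrange(entry[0])
            if j < self.rows_per_value:
                rows[j] = row
        # When the sample overflows, raise the level and evict lower-level values.
        while len(self.entries) > self.max_values:
            self.level += 1
            self.entries = {
                v: e for v, e in self.entries.items()
                if priority_level(v) >= self.level
            }

    def estimate_distinct(self, predicate=lambda row: True):
        """Run the query on the sample, then scale the result by 2**level."""
        matching = sum(
            1 for _, rows in self.entries.values()
            if any(predicate(r) for r in rows)
        )
        return matching * 2 ** self.level
```

In this sketch, a sample that never overflows stays at level 0 and returns an exact count; once the level rises, the estimate is approximate. A predicate on attributes other than the target attribute is evaluated only against the retained rows of each value, so it may undercount values whose matching rows were not retained.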