摘要 |
The system, method, and program of this invention collects multi-column statistics, by a database management system, to reflect a relationship among multiple columns of a table in a relational database. These statistics are stored in the system catalog, and are used during query optimization to obtain an estimate of the number of qualifying rows when a query has predicates on multiple columns of a table.A multi-column linear quantile statistic is collected by dividing the data of multiple columns into sub-ranges where each sub-range has approximately an even distribution of data, and determining a frequency and cardinality of each sub-range. A multi-column polygonal quantile statistic is collected by dividing the data of multiple columns into sub-spaces where each sub-space contains approximately the same number of tuples, and determining a frequency and cardinality of each sub-space.The system catalog is accessed for the stored multi-column linear quantile statistic for a query having a single range predicate and at least one equal predicate to determine the selectivity value for the predicates of the query. The system catalog is accessed for the stored multi-column polygonal quantile statistic for a query having more than one range predicate. These statistics are used in various ways to determine the selectivity value for the predicates of the query.
|