摘要 |
A method and system to determine a q, θ-optimal histogram comprising a plurality of buckets over a data distribution where for any cardinality estimate made using the histogram the cardinality estimate is constrained to obey an acceptability criteria parameterized by q and θ that bounds a ratio error between the cardinality estimate and a true value of the cardinality, q being a factor by which the estimate deviates, at most, from a true value of the cardinality and θ being a threshold value which the cardinality does not exceed, wherein a maximum number of possible query intervals generated in determining the acceptability of the q, θ-optimal histogram is less than quadratic in the number of values. |
主权项 |
1. A computer-implemented method of optimizing execution of a query that accesses data by a computer, the method comprising:
generating, by a query sever having a processor, a cardinality estimate using a q, θ-optimal histogram comprising a plurality of buckets over a data distribution where, for any cardinality estimate made using the histogram, the cardinality estimate is constrained to obey an acceptability criteria parameterized by q and θ that bounds a ratio error between the cardinality estimate and a true value of the cardinality, q being a factor by which the estimate deviates, at most, from a true value of the cardinality and θ being a threshold value which the cardinality estimate does not exceed, wherein a maximum number of possible query intervals generated in determining the acceptability of the q, θ-optimal histogram is less than quadratic in the number of values for the maximum number of possible query intervals; using the cardinality estimation generated based on the q, θ-optimal histogram that obeys the acceptability criteria parameterized by q and θ to determine, by the processor of the query server, an optimal query plan for executing the query; and producing an output of the optimal query plan, wherein a bucket in the plurality of buckets is constructed by incrementally extending a smaller q, θ-acceptable bucket without need to test the new bucket for q, θ-violations whose intervals are wholly contained within the smaller bucket. |