Title of invention |
Estimating error propagation for database optimizers |
Abstract |
Techniques are disclosed to determine error propagation for a query optimizer component of a database management system for a database. A database query is received that specifies one or more query conditions. Measures of actual and estimated selectivity of the one or more query conditions are determined. A measure of estimated deviation between the measures of actual and estimated selectivity is determined. A query execution plan is generated or selected based on the measure of estimated deviation. |
Publication number |
US9251213(B2) |
Publication date |
2016.02.02 |
Application number |
US201313843657 |
Filing date |
2013.03.15 |
Applicant |
International Business Machines Corporation |
Inventor |
Singh Harpreet |
Classification |
G06F17/30 |
Main classification |
G06F17/30 |
Agency |
Patterson & Sheridan, LLP |
Agent |
Patterson & Sheridan, LLP |
Independent claim |
1. A computer-implemented method to determine error propagation for a query optimizer component of a database management system (DBMS) for a database, the method comprising:
receiving, from a requesting entity, a database query specifying one or more query conditions to apply to at least a first database column when querying one or more database tables;
determining a measure of actual selectivity of the one or more query conditions specified in the received database query, wherein the measure of actual selectivity comprises a predefined function of at least: (i) a total count of rows from the one or more database tables before applying the one or more query conditions; and (ii) a total count of rows returned from the one or more database tables as a result of applying the one or more query conditions;
determining a measure of estimated selectivity of the one or more query conditions specified in the received database query, wherein the measure of estimated selectivity comprises a predefined function of at least two of: (i) the total count of rows from the one or more database tables before applying the one or more query conditions; (ii) a total count of null values in the first database column; (iii) a total count of instances count of distinct values represented in a frequency table associated with the first database table; and (v) a total count of distinct values in the first database column;
determining a measure of estimated deviation between the determined measure of actual selectivity and the determined measure of estimated selectivity; and
executing the database query based on the determined measure of estimated deviation and by operation of one or more computer processors in order to generate a set of results, wherein the set of results is returned to the requesting entity. |
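The quantities recited in the claim can be illustrated with a minimal Python sketch. The claim only says that each measure is "a predefined function" of certain counts, so every formula below is an assumption chosen for illustration: actual selectivity is taken as returned rows divided by total rows, estimated selectivity uses a uniform-distribution guess for an equality condition built from inputs (i), (ii), and (v), and the deviation is a plain absolute difference. The function names, the threshold, and the plan labels are hypothetical and do not come from the patent.

```python
def actual_selectivity(total_rows: int, returned_rows: int) -> float:
    """Assumed form: fraction of rows that satisfy the query conditions."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return returned_rows / total_rows


def estimated_selectivity(total_rows: int, null_count: int,
                          distinct_count: int) -> float:
    """Assumed form: equality condition under a uniform-distribution guess.

    Non-null rows are spread evenly over the distinct values, so one value
    is expected to match (total_rows - null_count) / distinct_count rows.
    """
    if total_rows <= 0 or distinct_count <= 0:
        raise ValueError("total_rows and distinct_count must be positive")
    expected_matches = (total_rows - null_count) / distinct_count
    return expected_matches / total_rows


def estimated_deviation(actual: float, estimated: float) -> float:
    """Assumed form: absolute difference between the two measures."""
    return abs(actual - estimated)


def choose_plan(deviation: float, threshold: float = 0.01) -> str:
    """Hypothetical policy: distrust the estimate when the deviation is large."""
    if deviation > threshold:
        return "robust_plan"
    return "estimated_optimal_plan"


if __name__ == "__main__":
    act = actual_selectivity(total_rows=1000, returned_rows=50)
    est = estimated_selectivity(total_rows=1000, null_count=100,
                                distinct_count=45)
    dev = estimated_deviation(act, est)
    print(act, est, choose_plan(dev))
```

In this example the condition actually matches 50 of 1000 rows, while the uniform guess expects 20, so the sketch falls back to the more conservative plan label. A real optimizer would feed the deviation into its cost model rather than compare it with a fixed threshold.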
地址 |
Armonk NY US |