Title: Estimating error propagation for database optimizers
Abstract: Techniques are disclosed to determine error propagation for a query optimizer component of a database management system for a database. A database query is received that specifies one or more query conditions. Measures of actual and estimated selectivity of the one or more query conditions are determined. A measure of estimated deviation between the measures of actual and estimated selectivity is determined. A query execution plan is generated or selected based on the measure of estimated deviation.
Publication number: US9251213(B2) Publication date: 2016.02.02
Application number: US201313843657 Filing date: 2013.03.15
Applicant: International Business Machines Corporation Inventor: Singh Harpreet
Classification: G06F17/30 Main classification: G06F17/30
Agency: Patterson & Sheridan, LLP Agent: Patterson & Sheridan, LLP
Principal claim: 1. A computer-implemented method to determine error propagation for a query optimizer component of a database management system (DBMS) for a database, the method comprising: receiving, from a requesting entity, a database query specifying one or more query conditions to apply to at least a first database column when querying one or more database tables; determining a measure of actual selectivity of the one or more query conditions specified in the received database query, wherein the measure of actual selectivity comprises a predefined function of at least: (i) a total count of rows from the one or more database tables before applying the one or more query conditions; and (ii) a total count of rows returned from the one or more database tables as a result of applying the one or more query conditions; determining a measure of estimated selectivity of the one or more query conditions specified in the received database query, wherein the measure of estimated selectivity comprises a predefined function of at least two of: (i) the total count of rows from the one or more database tables before applying the one or more query conditions; (ii) a total count of null values in the first database column; (iii) a total count of instances count of distinct values represented in a frequency table associated with the first database table; and (v) a total count of distinct values in the first database column; determining a measure of estimated deviation between the determined measure of actual selectivity and the determined measure of estimated selectivity; and executing the database query based on the determined measure of estimated deviation and by operation of one or more computer processors in order to generate a set of results, wherein the set of results is returned to the requesting entity.
Address: Armonk, NY, US
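
The steps recited in the principal claim can be illustrated with a minimal Python sketch. It is not the patented method: the claim only says each measure is "a predefined function" of certain counts, so the specific formulas here are assumptions. Actual selectivity is taken as rows returned divided by total rows; estimated selectivity assumes an equality predicate with non-null rows spread uniformly over the distinct values; the deviation is taken as an absolute difference. The names `ColumnStats`, `choose_plan`, the plan labels, and the `0.05` threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnStats:
    """Hypothetical catalog statistics for one database column."""
    total_rows: int      # rows before applying the query conditions
    null_count: int      # null values in the column
    distinct_count: int  # distinct values in the column


def actual_selectivity(total_rows: int, rows_returned: int) -> float:
    """Fraction of rows that survived the query conditions."""
    if total_rows <= 0:
        return 0.0
    return rows_returned / total_rows


def estimated_selectivity(stats: ColumnStats) -> float:
    """Assumed estimator for `column = value`: non-null rows are
    spread uniformly over the distinct values."""
    if stats.total_rows <= 0 or stats.distinct_count <= 0:
        return 0.0
    non_null_rows = stats.total_rows - stats.null_count
    return non_null_rows / (stats.distinct_count * stats.total_rows)


def estimated_deviation(actual: float, estimated: float) -> float:
    """Assumed deviation measure: absolute difference of selectivities."""
    return abs(actual - estimated)


def choose_plan(deviation: float, threshold: float = 0.05) -> str:
    """Hypothetical policy: distrust the optimizer's estimate when the
    deviation exceeds the threshold."""
    if deviation > threshold:
        return "conservative_plan"
    return "optimizer_default_plan"


stats = ColumnStats(total_rows=1000, null_count=100, distinct_count=9)
actual = actual_selectivity(stats.total_rows, rows_returned=400)
estimated = estimated_selectivity(stats)
deviation = estimated_deviation(actual, estimated)
plan = choose_plan(deviation)
```

With these illustrative numbers the uniform estimate (about 0.1) is well below the observed selectivity (0.4), so the hypothetical policy falls back to the conservative plan. A real optimizer would draw the counts from its catalog statistics and could use any other deviation measure, such as a ratio.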