Title of invention Unique value calculation in partitioned tables
Abstract An estimation algorithm can generate a uniqueness metric representative of data in a database table column that is split across a plurality of data partitions. The column can be classified as categorical if the uniqueness metric is below a threshold and as non-categorical if the uniqueness metric is above the threshold. A first estimation factor can be assigned to the column if the column is classified as categorical or a larger second estimation factor can be assigned if the column is non-categorical. A cost estimate for system resources required to perform a database operation on the database table can be calculated. The cost estimate can include an estimated total number of distinct values in the column across all of the plurality of data partitions determined using the assigned first estimation factor or second estimation factor and a number of rows in the table as inputs to an estimation function.
Publication number US8880510(B2) Publication date 2014.11.04
Application number US201113336928 Filing date 2011.12.23
Applicant SAP SE Inventor Fricke Lars; Hwang Sangyong
Classification G06F17/30 Main classification G06F17/30
Agency Mintz Levin Cohn Ferris Glovsky and Popeo, P.C. Agent Mintz Levin Cohn Ferris Glovsky and Popeo, P.C.
Main claim 1. A computer program product comprising a non-transitory machine-readable medium storing instructions that, when executed by at least one programmable processor, cause the at least one programmable processor to perform operations comprising: applying an estimation algorithm to generate a uniqueness metric representative of data in a column of a database table, the column being split across a plurality of data partitions; classifying the column as categorical or non-categorical by comparing the uniqueness metric to a threshold, wherein the column is classified as categorical if the uniqueness metric is less than the threshold, and the column is classified as non-categorical if the uniqueness metric is equal to or greater than the threshold; assigning one of a first estimation factor and a second estimation factor to the column, the assigning comprising the first estimation factor if the column is classified as categorical and the second estimation factor if the column is classified as non-categorical, the second estimation factor being larger than the first estimation factor; calculating a cost estimate for system resources required to perform a database operation on the database table, the cost estimate comprising an estimated total number of distinct values in the column across all of the plurality of data partitions determined using the assigned first estimation factor or second estimation factor and a number of rows in the table as inputs to an estimation function; and promoting the cost estimate.
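Illustrative sketch (not part of the patent record): the claimed sequence of metric, classification, factor assignment, and distinct-value estimate might be outlined in Python as follows. The record does not disclose the estimation algorithm, the threshold, the factor values, or the estimation function, so everything concrete here is a hypothetical stand-in: the constants `THRESHOLD`, `FIRST_FACTOR`, and `SECOND_FACTOR`, the ratio-based `uniqueness_metric`, and the linear, clamped `estimate_distinct`. The cost calculation itself is omitted; only the distinct-value estimate that would feed into it is shown.

```python
import math
from typing import Hashable, Sequence

# All three constants are hypothetical; the record gives no concrete values.
THRESHOLD = 0.1
FIRST_FACTOR = 0.01   # assigned to categorical columns
SECOND_FACTOR = 0.5   # assigned to non-categorical columns; must be the larger one

Partition = Sequence[Hashable]


def uniqueness_metric(partitions: Sequence[Partition]) -> float:
    """Assumed metric: per-partition distinct counts summed, divided by total rows.

    A value present in several partitions is counted once per partition,
    so this is a cheap indicator rather than an exact distinct ratio.
    """
    total_rows = sum(len(p) for p in partitions)
    if total_rows == 0:
        return 0.0
    distinct_per_partition = sum(len(set(p)) for p in partitions)
    return distinct_per_partition / total_rows


def is_categorical(metric: float, threshold: float = THRESHOLD) -> bool:
    """Categorical if strictly below the threshold, as in the claim."""
    return metric < threshold


def assign_factor(categorical: bool) -> float:
    """Pick the first (smaller) or second (larger) estimation factor."""
    return FIRST_FACTOR if categorical else SECOND_FACTOR


def estimate_distinct(factor: float, row_count: int) -> int:
    """Assumed estimation function: factor * rows, clamped to [1, row_count]."""
    if row_count <= 0:
        return 0
    return max(1, min(row_count, math.ceil(factor * row_count)))


def estimate_column(partitions: Sequence[Partition]) -> int:
    """Run metric, classification, factor assignment, and estimate in order."""
    row_count = sum(len(p) for p in partitions)
    metric = uniqueness_metric(partitions)
    factor = assign_factor(is_categorical(metric))
    return estimate_distinct(factor, row_count)
```

Under these assumptions, a two-value status column split over two partitions would be classified as categorical and receive a small estimate, while a column of unique identifiers would be classified as non-categorical and receive a larger one. A metric exactly equal to the threshold falls on the non-categorical side, matching the "equal to or greater than" wording of the claim.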
Address Walldorf DE