Title of invention: Coalescing operation for query processing
Abstract: According to one embodiment of the present invention, a system for processing queries analyzes statistical information of input data records in relation to a first operation for a query. The system applies the first operation to a plurality of groups of input data records to produce corresponding groups of output data records and, based on the analysis, coalesces the groups of output data records to form larger groups of data records for input to a subsequent second operation for the query. Embodiments of the present invention further include a method and computer program product for processing queries in substantially the same manner described above.
Publication number: US9569492(B2)  Publication date: 2017.02.14
Application number: US201414148944  Filing date: 2014.01.07
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION  Inventors: Aute Ravindra D.; Gopal Venkatesh S.
Classification: G06F7/00; G06F17/30  Main classification: G06F7/00
Agency: Edell, Shapiro & Finnan, LLC  Agent: Kashef Mohammed; Edell, Shapiro & Finnan, LLC
Main claim: 1. A system for processing queries comprising: at least one processor configured to:
  analyze, in a column-wise relational database management system, statistical information of input data records in relation to a first operation for a query, wherein the input data records are divided into chunks having a predefined, system-wide, size, the first operation is a filtering operation that produces output groups of a smaller size than the chunks of the input data records, and the analysis comprises estimating a selectivity value that depends upon a number of output records for the first operation;
  set a predetermined threshold such that a performance overhead of a coalescing operation on the output groups is less than a performance cost of performing subsequent steps on the produced output groups;
  apply the first operation to the chunks of the input data records to produce the corresponding output groups of output data records;
  in response to the selectivity value being below the predetermined threshold, coalesce the output groups of the output data records to form larger output groups of the output data records for input to a subsequent second operation for the query based on the analysis;
  generate a plan to determine query results, wherein the plan comprises the first and second operations; and
  insert a coalescing operation into the plan to operate on the output groups of the output data records in response to the selectivity value being below the predetermined threshold.
Address: Armonk, NY, US
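
Illustrative sketch (not part of the patent record): the steps recited in the main claim can be pictured with a small Python example. This is a minimal sketch under stated assumptions, not the patented implementation. The names `CHUNK_SIZE`, `SELECTIVITY_THRESHOLD`, `chunked`, `estimate_selectivity`, `filter_chunks`, `coalesce`, `build_plan`, and `run` are all hypothetical, the threshold is a fixed constant rather than one derived from a cost comparison, and selectivity is estimated by scanning the input directly instead of consulting stored statistics.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

CHUNK_SIZE = 4                # hypothetical system-wide chunk size (tiny, for illustration)
SELECTIVITY_THRESHOLD = 0.5   # hypothetical; the claim ties this to coalescing overhead vs. downstream cost


def chunked(records: List[int], size: int = CHUNK_SIZE) -> Iterator[List[int]]:
    """Divide the input records into fixed-size chunks."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def estimate_selectivity(records: List[int], predicate: Callable[[int], bool]) -> float:
    """Estimate the fraction of records that pass the filter (assumption: full scan)."""
    if not records:
        return 1.0
    return sum(1 for r in records if predicate(r)) / len(records)


def filter_chunks(chunks: Iterable[List[int]],
                  predicate: Callable[[int], bool]) -> Iterator[List[int]]:
    """First operation: filter each chunk, producing one (smaller) output group per chunk."""
    for chunk in chunks:
        yield [r for r in chunk if predicate(r)]


def coalesce(groups: Iterable[List[int]], size: int = CHUNK_SIZE) -> Iterator[List[int]]:
    """Merge small output groups into larger groups of up to `size` records."""
    buffer: List[int] = []
    for group in groups:
        buffer.extend(group)
        while len(buffer) >= size:
            yield buffer[:size]
            buffer = buffer[size:]
    if buffer:
        yield buffer  # final, possibly partial, group


def build_plan(selectivity: float, threshold: float = SELECTIVITY_THRESHOLD) -> List[str]:
    """Generate a plan; insert a coalescing step only when selectivity is below the threshold."""
    plan = ["filter"]
    if selectivity < threshold:
        plan.append("coalesce")
    plan.append("second_operation")
    return plan


def run(records: List[int],
        predicate: Callable[[int], bool]) -> Tuple[List[str], List[List[int]]]:
    """Return the plan and the groups that would be fed to the second operation."""
    selectivity = estimate_selectivity(records, predicate)
    plan = build_plan(selectivity)
    groups: Iterable[List[int]] = filter_chunks(chunked(records), predicate)
    if "coalesce" in plan:
        groups = coalesce(groups)
    return plan, list(groups)
```

With 16 input records and a filter that keeps one record in four, the estimated selectivity is 0.25, so the sketch inserts the coalescing step and the second operation receives one full group instead of four single-record groups. With a filter that keeps three records in four, the plan omits coalescing and the four output groups pass through unchanged.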