Title of invention: Efficient pushdown of joins in a heterogeneous database system involving a large-scale low-power cluster
Abstract: A system and method for allocating join processing between an RDBMS and an assisting cluster. In one embodiment, the method estimates a cost of performing the join completely in the RDBMS and the cost of performing the join with the assistance of a cluster coupled to the RDBMS. The cost of performing the join with the assistance of the cluster includes estimating a cost of a broadcast join or a partition join depending on the sizes of the tables. Additional costs are incurred when there is a blocking operation, which prevents the cluster from being able to process portions of the join. The RDBMS also maintains transactional consistency when the cluster performs some or all of the join processing.
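The cost comparison described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give cost formulas, so everything here is an assumption made for illustration: the names `JoinEstimate`, `estimate_cluster_cost`, and `choose_plan`, the `broadcast_limit` threshold, the row-count cost model, and the flat `blocking_penalty` are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class JoinEstimate:
    strategy: str  # "broadcast" or "partition"
    cost: float    # abstract cost units (rows moved plus rows scanned)


def estimate_cluster_cost(small_rows, large_rows, nodes,
                          broadcast_limit, blocking_penalty=0.0):
    """Hypothetical cost model: pick the join type from the table sizes."""
    if small_rows <= broadcast_limit:
        # Broadcast join: every node receives the whole smaller table
        # and scans only its own share of the larger table.
        cost = small_rows * nodes + large_rows / nodes
        strategy = "broadcast"
    else:
        # Partition join: both tables are redistributed on the join key
        # once, then each node joins its share.
        total = small_rows + large_rows
        cost = total + total / nodes
        strategy = "partition"
    # A blocking operation adds cost because part of the join cannot
    # be processed by the cluster (modeled here as a flat penalty).
    return JoinEstimate(strategy, cost + blocking_penalty)


def choose_plan(rdbms_cost, cluster):
    """Use the cluster only when its estimated cost is strictly lower."""
    if cluster.cost < rdbms_cost:
        return "cluster:" + cluster.strategy
    return "rdbms"
```

For example, under these assumed formulas, `choose_plan(500_000, estimate_cluster_cost(100, 1_000_000, 10, 1000))` selects the cluster with a broadcast join, while a large enough `blocking_penalty` tips the same query back to the RDBMS.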
Publication number: US8849871(B2) Publication date: 2014.09.30
Application number: US201213645030 Filing date: 2012.10.04
Applicant: Oracle International Corporation Inventors: Idicula Sam; Petride Sabina; Agarwal Nipun; Sedlar Eric
Classification: G06F7/00; G06F17/30 Main classification: G06F7/00
Agency: Hickman Palermo Truong Becker Bingham Wong LLP Agent: Hickman Palermo Truong Becker Bingham Wong LLP
Principal claim: 1. A method for performing a join in a data set managed by a relational database management system (RDBMS) coupled to a cluster of nodes, each node configured to store a portion of the data set in non-persistent memory, the method comprising: storing, by the RDBMS, transactional data corresponding to one or more database transactions performed on the data set; determining a snapshot identifier of a query containing the join; estimating a cost of performing the join fully in the RDBMS without performing distributed join operation in the cluster of nodes; estimating a cost of performing the join with the assistance of the cluster by performing at least one distributed join operation in at least one node of the cluster of nodes; and performing the join with the assistance of the cluster when the estimated cost of performing the join with the assistance of the cluster is lower than the estimated cost of performing the join fully in the RDBMS; wherein the RDBMS ensures transactional consistency based on the transactional data when the join is performed with the assistance of the cluster by preventing the cluster from performing distributed join operations on data having a snapshot identifier later than the snapshot identifier of the query.
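The snapshot-based consistency rule in the final clause of the claim can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Row` type, the per-row `snapshot_id` field, and the functions `visible_rows` and `cluster_join` are hypothetical, and the claim does not say how the RDBMS actually enforces the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    key: int          # join key
    value: str        # payload
    snapshot_id: int  # assumed: identifier of the snapshot that wrote the row


def visible_rows(rows, query_snapshot_id):
    """Keep only rows whose snapshot is not later than the query's snapshot."""
    return [r for r in rows if r.snapshot_id <= query_snapshot_id]


def cluster_join(left, right, query_snapshot_id):
    """Hash join restricted to data visible at the query's snapshot."""
    index = {}
    for r in visible_rows(right, query_snapshot_id):
        index.setdefault(r.key, []).append(r)
    return [
        (l.value, r.value)
        for l in visible_rows(left, query_snapshot_id)
        for r in index.get(l.key, [])
    ]
```

In this sketch, rows written after the query's snapshot are simply excluded from the cluster-side join; handling of those rows would fall to the RDBMS, which holds the transactional data.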
Address: Redwood Shores CA US