Title of Invention |
PERFORMING PARALLEL JOINS ON DISTRIBUTED DATABASE DATA |
Abstract |
The present invention extends to methods, systems, and computer program products for performing parallel joins on distributed database data. Embodiments of the invention include a phased semi-join reduction strategy using replication and shuffle operations to join a first and a second data source. A filter building phase uses replication and pushes down a Distinct (e.g., SQL) operator to produce a list of join keys for the first data source (one side of the join). A shuffle phase for the second data source is modified to join to the key list produced in the first phase as a row filtering mechanism. A join phase then joins the first and second data sources.
|
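The three phases described in the abstract can be illustrated with a minimal single-process sketch. This is not the patented implementation: it assumes each data source is a list of per-node partitions holding rows as dictionaries, that both sources are shuffled by hashing the join key, and that the final join is a local hash join. The function names (`build_key_filter`, `shuffle`, `local_join`, `semi_join_reduced_join`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_key_filter(first_partitions, key):
    """Filter building phase (sketch): each node computes its DISTINCT join
    keys; the union stands in for the key list replicated to every node."""
    keys = set()
    for rows in first_partitions:
        keys |= {row[key] for row in rows}
    return keys


def shuffle(partitions, key, num_nodes, key_filter=None):
    """Shuffle phase (sketch): route each row to a node by hashing its join
    key. When a key filter is given, rows whose key is absent are dropped
    before they are moved."""
    out = [[] for _ in range(num_nodes)]
    for rows in partitions:
        for row in rows:
            if key_filter is not None and row[key] not in key_filter:
                continue  # row cannot match anything in the first source
            out[hash(row[key]) % num_nodes].append(row)
    return out


def local_join(first_partitions, second_partitions, key):
    """Join phase (sketch): hash join of co-located partitions, node by node."""
    result = []
    for left, right in zip(first_partitions, second_partitions):
        index = defaultdict(list)
        for row in left:
            index[row[key]].append(row)
        for row in right:
            for match in index.get(row[key], []):
                result.append({**match, **row})
    return result


def semi_join_reduced_join(first, second, key, num_nodes):
    """Run the three phases in order (assumed orchestration, for illustration)."""
    key_filter = build_key_filter(first, key)
    first_shuffled = shuffle(first, key, num_nodes)
    second_shuffled = shuffle(second, key, num_nodes, key_filter)
    return local_join(first_shuffled, second_shuffled, key)
```

In this sketch the benefit comes from the filtered shuffle: rows of the second data source that cannot match any key in the first are discarded before being redistributed, so less data crosses the network ahead of the join.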
Publication Number |
US2012317093(A1) |
Publication Date |
2012.12.13 |
Application Number |
US201113154911 |
Filing Date |
2011.06.07 |
Applicant |
TELETIA NIKHIL;HALVERSON ALAN DALE;BLAKELEY JOSE A.;JOSHI MILIND MADHUKAR;SABORIT JOSE AGUILAR;MICROSOFT CORPORATION |
Inventor |
TELETIA NIKHIL;HALVERSON ALAN DALE;BLAKELEY JOSE A.;JOSHI MILIND MADHUKAR;SABORIT JOSE AGUILAR |
Classification Number |
G06F17/30 |
Main Classification Number |
G06F17/30 |
Patent Agency |
|
Patent Agent |
|
Principal Claim |
|
Address |
|