Title of Invention |
PARALLELIZING SQL ON DISTRIBUTED FILE SYSTEMS |
Abstract |
Example embodiments relate to parallelizing structured query language (SQL) on distributed file systems. In example embodiments, a subquery of a distributed file system is received from a query engine, where the subquery is one of multiple subqueries that are scheduled to execute on a cluster of server nodes. A user defined function (UDF) that comprises local, role-based functionality is then executed, where a partitioned magic table triggers parallel execution of the UDF. The execution of the UDF determines a sequence number based on the number of server nodes in the cluster and retrieves nonconsecutive chunks from a file of the distributed file system, where each of the nonconsecutive chunks is offset by the sequence number. |
Publication Number |
US2017011090(A1) |
Publication Date |
2017.01.12 |
Application Number |
US201415114328 |
Filing Date |
2014.03.31 |
Applicant |
HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP |
Inventors |
Chen Qiming;Hsu Meichun;Castellanos Maria G. |
Classification |
G06F17/30 |
Main Classification |
G06F17/30 |
Agency |
|
Agent |
|
Principal Claim |
1. A system for parallelizing structured query language (SQL) on distributed file systems, the system comprising:
a storage device configured to store a partitioned magic table that is partitioned across a plurality of server nodes;
a processor to:
receive a subquery of a distributed file system from a query engine, wherein the subquery is one of a plurality of subqueries that are scheduled to execute on the plurality of server nodes;
execute a user defined function in parallel, wherein the partitioned magic table triggers the parallel execution of the user defined function;
determine a sequence number based on a quantity of the plurality of server nodes; and
retrieve a plurality of nonconsecutive chunks from a file of the distributed file system, wherein each of the plurality of nonconsecutive chunks is offset by the sequence number. |
Address |
Houston TX US |
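
The chunk retrieval step in the principal claim can be illustrated with a minimal sketch. It rests on one possible reading of the claim, which is an assumption and not stated in this record: each of N server nodes starts at its own chunk index and then advances by a stride of N chunks, so that the nodes together cover the file without overlap. The names `chunk_offsets`, `read_chunks`, `node_index`, and `chunk_size` are hypothetical, and an in-memory byte string stands in for a file of the distributed file system.

```python
from typing import Iterator, List


def chunk_offsets(node_index: int, num_nodes: int,
                  file_size: int, chunk_size: int) -> Iterator[int]:
    """Yield byte offsets of the nonconsecutive chunks assigned to one node.

    Assumed interpretation: the stride between a node's chunks is derived
    from the number of server nodes (num_nodes * chunk_size bytes).
    """
    if num_nodes <= 0 or chunk_size <= 0:
        raise ValueError("num_nodes and chunk_size must be positive")
    if not 0 <= node_index < num_nodes:
        raise ValueError("node_index must be in [0, num_nodes)")
    stride = num_nodes * chunk_size      # distance between this node's chunks
    offset = node_index * chunk_size     # first chunk owned by this node
    while offset < file_size:
        yield offset
        offset += stride


def read_chunks(data: bytes, node_index: int,
                num_nodes: int, chunk_size: int) -> List[bytes]:
    """Return the chunks one node would retrieve from `data`."""
    return [data[off:off + chunk_size]
            for off in chunk_offsets(node_index, num_nodes,
                                     len(data), chunk_size)]


if __name__ == "__main__":
    sample = b"abcdefghij"
    for node in range(3):
        print(node, read_chunks(sample, node, num_nodes=3, chunk_size=2))
```

Under this reading, the chunks read by all nodes are disjoint and together reconstruct the file, which is what would allow the subqueries to run in parallel without coordinating their reads.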