Title PARALLELIZING SQL ON DISTRIBUTED FILE SYSTEMS
Abstract Example embodiments relate to parallelizing structured query language (SQL) on distributed file systems. In example embodiments, a subquery of a distributed file system is received from a query engine, where the subquery is one of multiple subqueries that are scheduled to execute on a cluster of server nodes. At this stage, a user defined function (UDF) that comprises local, role-based functionality is executed, where a partitioned magic table triggers parallel execution of the UDF. The execution of the UDF determines a sequence number based on a quantity of the cluster of server nodes and retrieves nonconsecutive chunks from a file of the distributed file system, where each of the nonconsecutive chunks is offset by the sequence number.
Publication No. US2017011090(A1) Publication date 2017.01.12
Application No. US201415114328 Filing date 2014.03.31
Applicant HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP Inventors Chen Qiming; Hsu Meichun; Castellanos Maria G.
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim 1. A system for parallelizing structured query language (SQL) on distributed file systems, the system comprising: a storage device configured to store a partitioned magic table that is partitioned across a plurality of server nodes; a processor to: receive a subquery of a distributed file system from a query engine, wherein the subquery is one of a plurality of subqueries that are scheduled to execute on the plurality of server nodes; execute a user defined function in parallel, wherein the partitioned magic table triggers the parallel execution of the user defined function; determine a sequence number based on a quantity of the plurality of server nodes; and retrieve a plurality of nonconsecutive chunks from a file of the distributed file system, wherein each of the plurality of nonconsecutive chunks is offset by the sequence number.
Address Houston TX US
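
The chunk retrieval step in the principal claim can be read in more than one way. The following is a minimal sketch of one plausible reading, not the patented implementation: each of N server nodes is assumed to hold a zero-based rank, and a node reads every N-th fixed-size chunk of the file starting at its own rank, so the chunks read by any one node are nonconsecutive and the nodes together cover the file. The names `chunk_indices`, `read_strided_chunks`, `node_rank`, and `chunk_size` are hypothetical, and an ordinary local file stands in for a file of the distributed file system.

```python
import os


def chunk_indices(node_rank, num_nodes, num_chunks):
    """Return the chunk indices assigned to one node (assumed striding scheme).

    Node `node_rank` takes chunks node_rank, node_rank + num_nodes,
    node_rank + 2 * num_nodes, and so on.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    if not 0 <= node_rank < num_nodes:
        raise ValueError("node_rank must be in [0, num_nodes)")
    return list(range(node_rank, num_chunks, num_nodes))


def read_strided_chunks(path, node_rank, num_nodes, chunk_size):
    """Read this node's nonconsecutive chunks from a local file.

    A local file is used here only as a stand-in for a distributed file.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    file_size = os.path.getsize(path)
    num_chunks = -(-file_size // chunk_size)  # ceiling division
    chunks = []
    with open(path, "rb") as f:
        for index in chunk_indices(node_rank, num_nodes, num_chunks):
            f.seek(index * chunk_size)
            chunks.append(f.read(chunk_size))
    return chunks
```

Under these assumptions, with three nodes and a ten-chunk file, `chunk_indices(1, 3, 10)` yields `[1, 4, 7]`. Because the assignment depends only on the node count and the node's own rank, each node can compute its chunks locally without coordinating with the others.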