Title of Invention: BACKGROUND FORMAT OPTIMIZATION FOR ENHANCED SQL-LIKE QUERIES IN HADOOP
Abstract: A format conversion engine for Apache Hadoop that converts data from its original format to a database-like format at certain time points for use by a low latency (LL) query engine. The format conversion engine comprises a daemon that is installed on each data node in a Hadoop cluster. The daemon comprises a scheduler and a converter. The scheduler determines when to perform the format conversion and notifies the converter when the time comes. The converter converts data on the data node from its original format to a database-like format for use by the low latency (LL) query engine.
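The scheduler/converter split described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the class names (`Scheduler`, `Converter`, `ConversionDaemon`), the fixed-interval policy, the CSV input, and the column-oriented dictionary used as a stand-in for a "database-like format" are all assumptions made for illustration.

```python
import csv
import io
from typing import Callable, Dict, List, Optional


class Converter:
    """Hypothetical converter: turns row-oriented CSV text into columns.

    The column-oriented dict is only a stand-in for a database-like format.
    """

    def convert(self, original: str) -> Dict[str, List[str]]:
        reader = csv.DictReader(io.StringIO(original))
        columns: Dict[str, List[str]] = {name: [] for name in (reader.fieldnames or [])}
        for row in reader:
            for name, value in row.items():
                columns[name].append(value)
        return columns


class Scheduler:
    """Hypothetical scheduler: decides when conversion is due and notifies a callback."""

    def __init__(self, interval_seconds: float, on_due: Callable[[], None]) -> None:
        self.interval_seconds = interval_seconds
        self.on_due = on_due
        self.last_run: Optional[float] = None

    def tick(self, now: float) -> bool:
        # Conversion is due on the first tick and then once per interval (assumed policy).
        due = self.last_run is None or now - self.last_run >= self.interval_seconds
        if due:
            self.last_run = now
            self.on_due()  # notify the converter that the time has come
        return due


class ConversionDaemon:
    """Hypothetical per-node daemon that wires a scheduler to a converter."""

    def __init__(self, original: str, interval_seconds: float) -> None:
        self.original = original
        self.converted: Optional[Dict[str, List[str]]] = None
        self.converter = Converter()
        self.scheduler = Scheduler(interval_seconds, self._run_conversion)

    def _run_conversion(self) -> None:
        self.converted = self.converter.convert(self.original)
```

In this sketch the caller passes the current time into `tick`, so the scheduling decision can be exercised without a real clock; a real daemon would presumably run on its own timer.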
Publication No.: US2017032003(A1) Publication Date: 2017.02.02
Application No.: US201615292053 Application Date: 2016.10.12
Applicant: Cloudera, Inc. Inventors: Kornacker Marcel;Erickson Justin;Li Nong;Kuff Lenni;Robinson Henry Noel;Choi Alan;Behm Alex
Classification: G06F17/30 Main Classification: G06F17/30
Agency: Agent:
Main Claim: 1. A method of data processing for query execution, the method being performed by a query engine instance running on each data node of a plurality of data nodes which together form a Hadoop™ distributed computing cluster, wherein a query is processed by whichever data node that receives the query, the method comprising: storing initial data in an original format at a data node in the plurality of data nodes forming a peer-to-peer network for the query, each data node functioning as a peer in the peer-to-peer network and being capable of interacting with components of the Hadoop™ cluster, each peer having an instance of a query engine running in memory; converting, at the data node, the initial data to be in a target format that is optimized for relational database processing according to a predetermined schedule; and storing the converted data together with the initial data.
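The three steps of the claim (store the initial data, convert it on a schedule, keep the converted copy alongside the original) can be sketched as follows. The names `StoredDataset`, `DataNode`, `store_initial`, `convert_pending`, and `to_columns` are hypothetical, as are the JSON Lines input and the columnar target layout; the sketch also assumes every record carries the same keys and leaves the scheduling trigger to the caller.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class StoredDataset:
    original: str  # initial data, kept unchanged in its original format
    converted: Optional[Dict[str, List[Any]]] = None  # target-format copy, once available


def to_columns(json_lines: str) -> Dict[str, List[Any]]:
    """Hypothetical conversion: JSON Lines records to a column-oriented layout.

    Assumes all records share the same keys.
    """
    columns: Dict[str, List[Any]] = {}
    for line in json_lines.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


@dataclass
class DataNode:
    """Hypothetical data node holding each dataset in both formats."""

    datasets: Dict[str, StoredDataset] = field(default_factory=dict)

    def store_initial(self, name: str, json_lines: str) -> None:
        self.datasets[name] = StoredDataset(original=json_lines)

    def convert_pending(self) -> List[str]:
        """Meant to be called when the schedule fires; returns the names converted."""
        done: List[str] = []
        for name, dataset in self.datasets.items():
            if dataset.converted is None:
                dataset.converted = to_columns(dataset.original)
                done.append(name)
        return done
```

Because the original text is never overwritten, a dataset that has not yet been converted remains readable in its initial format, and a second call to `convert_pending` does no further work.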
Address: Palo Alto CA US