Title PROCESSING DATA FROM MULTIPLE SOURCES
Abstract In a first aspect, a method includes, at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage, executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster, receiving a computer-executable program by the data processing engine, executing at least part of the program by the first instance of the data processing engine, receiving, by the data processing engine, a second portion of data from the external data source, storing the second portion of data other than in HDFS storage, and performing, by the data processing engine, a data processing operation identified by the program using at least the first portion of data and the second portion of data.
Publication number US2015302075(A1) Publication date 2015.10.22
Application number US201414255579 Filing date 2014.04.17
Applicant Ab Initio Technology LLC Inventors Schechter Ian; Wakeling Tim; Wollrath Ann M.
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim 1. A method including: at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage: executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster; receiving a computer-executable program by the data processing engine; executing at least part of the program by the first instance of the data processing engine; receiving, by the data processing engine, a second portion of data from the external data source; storing the second portion of data other than in HDFS storage; and performing, by the data processing engine, a data processing operation identified by the program using at least the first portion of data and the second portion of data.
Address Lexington MA US
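
The steps recited in claim 1 can be illustrated with a minimal in-memory sketch. It is not the applicant's implementation and uses no Hadoop API. The class EngineInstance, the function join_on_customer, the field names, and the sample records are all hypothetical. A plain list stands in for the node's HDFS-resident portion, an iterator stands in for the external data source, and a join is assumed as the data processing operation (the claim does not name a specific operation).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Program = Callable[[List[Record], List[Record]], List[Record]]


@dataclass
class EngineInstance:
    """Hypothetical stand-in for one engine instance running on a cluster node."""

    # First portion: data the node already holds (a list stands in for HDFS here).
    hdfs_portion: List[Record]
    # Second portion: held in node-local memory, i.e. "other than in HDFS storage".
    local_buffer: List[Record] = field(default_factory=list)

    def receive_external(self, source: Iterable[Record]) -> None:
        # Pull records from the external source into the local buffer only.
        self.local_buffer.extend(source)

    def run(self, program: Program) -> List[Record]:
        # Execute the received program against both portions of data.
        return program(self.hdfs_portion, self.local_buffer)


def join_on_customer(first: List[Record], second: List[Record]) -> List[Record]:
    # Assumed operation: inner join of the two portions on "customer_id".
    by_id = {row["customer_id"]: row for row in second}
    return [
        {**row, **by_id[row["customer_id"]]}
        for row in first
        if row["customer_id"] in by_id
    ]


engine = EngineInstance(
    hdfs_portion=[
        {"customer_id": 1, "amount": 30},
        {"customer_id": 2, "amount": 45},
        {"customer_id": 3, "amount": 5},
    ]
)
external_source = iter(
    [
        {"customer_id": 1, "region": "EU"},
        {"customer_id": 2, "region": "US"},
    ]
)
engine.receive_external(external_source)
result = engine.run(join_on_customer)
```

In this sketch the externally supplied records never enter hdfs_portion, which mirrors the claim's requirement that the second portion be stored other than in HDFS storage while still being combined with the first portion by the operation the program identifies.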