Title Integrating Execution of Computing Analytics within a Mapreduce Processing Environment
Abstract Embodiments of the disclosure can include MapReduce systems and methods with integral mapper and reducer compute runtime environments. An example system with an integral reducer compute runtime environment can include mappers and reducers executable on a computer cluster. The mappers can be operable to receive raw input data and generate first input data based on the raw input data. The mappers can be operable to generate first result data based on the first input data. Based on the first result data, the mappers can be operable to generate (K, V) pairs. The reducers can be operable to receive the (K, V) pairs and generate second input data based on the (K, V) pairs. The reducers can be operable to transmit the second input data to an integral compute runtime environment that runs within the reducers and is operable to generate second result data based on the second input data. Based on the second result data, the reducers can be operable to generate output data.
Publication number US2015379022(A1) Publication date 2015.12.31
Application number US201414317687 Filing date 2014.06.27
Applicant General Electric Company Inventors Puig Ernest Charles;Interrante John A.;Osborn Mark David;Pool Eric
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim 1. A MapReduce system with an integral reducer compute runtime environment, the system comprising: one or more mappers executable on a computer cluster, the one or more mappers operable to: receive raw input data; generate first input data based at least in part on the raw input data; generate first result data based at least in part on the first input data; and generate (K, V) pairs based at least in part on the first result data; and one or more reducers executable on the computer cluster, the one or more reducers operable to: receive the (K, V) pairs generated by the one or more mappers; generate second input data based at least in part on the (K, V) pairs; transmit the second input data, via one or more proxies associated with the one or more reducers, to at least one integral compute runtime environment, wherein the at least one integral compute runtime environment is run within the one or more reducers and operable to generate second result data based at least in part on the second input data; and generate output data based at least in part on the second result data.
Address Schenectady NY US
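
Illustrative sketch (not part of the patent record). The data flow recited in claim 1 can be sketched in a few lines of single-process Python. This is only a minimal illustration under stated assumptions: the names `ComputeRuntime`, `RuntimeProxy`, `mapper`, `reducer`, and `run_job`, the comma-separated "sensor, reading" input format, and the count/mean analytic are all hypothetical and do not come from the patent, which does not specify a language, an input format, or a particular analytic. The in-memory grouping stands in for the shuffle step that a real cluster would perform.

```python
from collections import defaultdict
from statistics import mean


class ComputeRuntime:
    """Hypothetical analytics engine hosted inside the reducer process."""

    def run(self, values):
        # Produces the "second result data" from the "second input data".
        return {"count": len(values), "mean": mean(values)}


class RuntimeProxy:
    """Hypothetical proxy through which a reducer reaches the runtime."""

    def __init__(self, runtime):
        self._runtime = runtime

    def submit(self, second_input):
        return self._runtime.run(second_input)


def mapper(raw_line):
    # Raw input data -> first input data (a parsed record).
    sensor_id, reading = raw_line.split(",")
    first_input = (sensor_id.strip(), float(reading))
    # First input data -> first result data (here: a rounded reading).
    first_result = (first_input[0], round(first_input[1], 1))
    # The first result data is emitted as a (K, V) pair.
    return first_result


def reducer(key, values, proxy):
    # (K, V) pairs -> second input data (here: the sorted values for one key).
    second_input = sorted(values)
    # The second input data is handed to the in-process runtime via the proxy.
    second_result = proxy.submit(second_input)
    # Second result data -> output data.
    return {"key": key, **second_result}


def run_job(raw_lines):
    # Group mapper output by key; a stand-in for the cluster shuffle.
    groups = defaultdict(list)
    for line in raw_lines:
        key, value = mapper(line)
        groups[key].append(value)
    proxy = RuntimeProxy(ComputeRuntime())
    return [reducer(k, vs, proxy) for k, vs in sorted(groups.items())]
```

Calling `run_job(["a, 1.0", "b, 4.0", "a, 3.0"])` under these assumptions yields one output record per key. Keeping the runtime behind a proxy object mirrors the claim's wording that the reducer transmits data "via one or more proxies" rather than calling the analytics engine directly.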