Title of invention |
Compiler-guided software accelerator for iterative HADOOP® jobs |
Abstract |
Various methods are provided directed to a compiler-guided software accelerator for iterative HADOOP® jobs. A method includes identifying intermediate data, generated by an iterative HADOOP® application, below a predetermined threshold size and used less than a predetermined threshold time period. The intermediate data is stored in a memory device. The method further includes minimizing input, output, and synchronization overhead for the intermediate data by selectively using at any given time any one of a Message Passing Interface and Distributed File System as a communication layer. The Message Passing Interface is co-located with the HADOOP® Distributed File System. |
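The selection step described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the two threshold constants, and the rule that small, short-lived intermediate data goes over MPI while everything else goes through the distributed file system are all hypothetical and are not taken from the patent text.

```java
// Hypothetical sketch: choose a communication layer for intermediate data.
// The thresholds and the mapping to MPI/DFS are assumptions for illustration.
final class CommLayerSketch {
    enum Layer { MPI, DFS }

    // Assumed "predetermined threshold size" (64 MiB) and
    // "predetermined threshold time period" (60 seconds).
    static final long MAX_BYTES = 64L * 1024 * 1024;
    static final long MAX_LIFETIME_MS = 60_000L;

    // Data that is both below the size threshold and used for less than the
    // time threshold is treated as in-memory data exchanged over MPI;
    // anything else falls back to the distributed file system.
    static Layer choose(long sizeBytes, long lifetimeMs) {
        boolean small = sizeBytes < MAX_BYTES;
        boolean shortLived = lifetimeMs < MAX_LIFETIME_MS;
        return (small && shortLived) ? Layer.MPI : Layer.DFS;
    }

    public static void main(String[] args) {
        System.out.println(choose(1024, 10));            // small and short-lived
        System.out.println(choose(MAX_BYTES * 2, 10));   // too large
    }
}
```

Keeping the decision in one pure function makes it easy to swap in different thresholds without touching the code that moves the data.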
Publication number |
US9201638(B2) |
Publication date |
2015.12.01 |
Application number |
US201313923458 |
Filing date |
2013.06.21 |
Applicant |
NEC Laboratories America, Inc. |
Inventors |
Ravi Nishkam;Verma Abhishek;Chakradhar Srimat T. |
Classification |
G06F9/45;G06F9/52;G06F9/54 |
Main classification |
G06F9/45 |
Agency |
|
Agent |
Kolodka Joseph |
Main claim |
1. A method, comprising:
identifying a set of map tasks and reduce tasks capable of being reused across multiple iterations of an iterative HADOOP® application; and reducing a system load imparted on a computer system executing the iterative HADOOP® application by transforming a source code of the iterative HADOOP® application to launch the map tasks in the set only once and keep the map tasks in the set alive for an entirety of the execution of the iterative HADOOP® application; wherein the map tasks in the set are kept alive for the entirety of the execution by guarding an invocation to a runJob( ) function beginning at a first iteration of the iterative HADOOP® application to prevent a re-launching of any of the map tasks and reduce tasks in the set in subsequent iterations of the iterative HADOOP® application; and wherein the invocation to the runJob( ) function is guarded by a flag, which is set to true for the first iteration and false for the subsequent iterations. |
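The flag-guarded invocation recited in the claim can be illustrated with a minimal sketch. It does not use the Hadoop API: `runJob()` here is a hypothetical stand-in that only counts launches, and the class and variable names are assumptions chosen for illustration, not the transformed source code the patent describes.

```java
// Hypothetical sketch of guarding runJob() with a flag so that tasks are
// launched once and reused across iterations. No real Hadoop calls are made.
final class GuardedRunJobSketch {
    // Counts how many times tasks were (re-)launched.
    static int launches = 0;

    // Stand-in for the job launch; a real driver would submit the job here.
    static void runJob() {
        launches++;
    }

    // Runs the iterative driver loop and returns the number of launches.
    static int execute(int iterations) {
        launches = 0;
        boolean firstIteration = true; // true for the first iteration only
        for (int i = 0; i < iterations; i++) {
            if (firstIteration) {
                runJob();               // launch map/reduce tasks once
                firstIteration = false; // false for all subsequent iterations
            }
            // Later iterations would hand new work to the already-running tasks.
        }
        return launches;
    }

    public static void main(String[] args) {
        System.out.println(execute(5)); // prints 1
    }
}
```

Without the guard, the same loop would call `runJob()` once per iteration, which is the repeated task launch overhead the claim aims to avoid.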
Address |
Princeton NJ US |