Title of invention |
System and method for large-scale data processing using an application-independent framework |
Abstract |
A large-scale data processing system and method for processing data in a distributed and parallel processing environment. The system includes an application-independent framework for processing data having a plurality of application-independent map modules and reduce modules. These application-independent modules use application-independent operators to automatically handle parallelization of computations across the distributed and parallel processing environment when performing user-specified data processing operations. The system also includes a plurality of user-specified, application-specific operators, for use with the application-independent framework to perform a user-specified data processing operation on a user-specified set of input files. The application-specific operators include: a map operator and a reduce operator. The map operator is applied by the application-independent map modules to input data in the user-specified set of input files to produce intermediate data values. The reduce operator is applied by the application-independent reduce modules to process the intermediate data values to produce final output data. |
Publication number |
US8612510(B2) |
Publication date |
2013.12.17 |
Application number |
US20100686292 |
Filing date |
2010.01.12 |
Applicant |
DEAN JEFFREY;GHEMAWAT SANJAY;GOOGLE INC. |
Inventor |
DEAN JEFFREY;GHEMAWAT SANJAY |
Classification |
G06F15/16 |
Main classification |
G06F15/16 |
Agency |
|
Agent |
|
Main claim |
|
Address |
|
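
The division of labor described in the abstract, with an application-independent framework that handles parallelization and user-supplied map and reduce operators that carry the application logic, can be illustrated with a minimal sketch. This is not the patented implementation: the names `run_mapreduce`, `word_count_map`, and `word_count_reduce`, the use of a local thread pool in place of a distributed cluster, and the in-memory strings standing in for input files are all assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_mapreduce(inputs, map_op, reduce_op, workers=4):
    """Application-independent driver (hypothetical name, not from the patent).

    inputs:    iterable of (key, value) input records
    map_op:    user-supplied function (key, value) -> iterable of (k2, v2)
    reduce_op: user-supplied function (k2, list of v2) -> output value
    """
    inputs = list(inputs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map phase: apply the user's map operator to every input record.
        mapped = list(pool.map(lambda kv: list(map_op(*kv)), inputs))

        # Shuffle: group intermediate values by intermediate key.
        groups = defaultdict(list)
        for pairs in mapped:
            for k2, v2 in pairs:
                groups[k2].append(v2)

        # Reduce phase: apply the user's reduce operator to each group.
        keys = sorted(groups)
        reduced = list(pool.map(lambda k: reduce_op(k, groups[k]), keys))
    return dict(zip(keys, reduced))


# Application-specific operators for a word count (illustrative only).
def word_count_map(filename, text):
    # Emit one (word, 1) intermediate pair per word in the input text.
    for word in text.split():
        yield word.lower(), 1


def word_count_reduce(word, counts):
    # Combine all intermediate counts for one word into a total.
    return sum(counts)


documents = [
    ("a.txt", "the quick brown fox"),
    ("b.txt", "the lazy dog and the fox"),
]
result = run_mapreduce(documents, word_count_map, word_count_reduce)
print(result)
```

In this sketch the driver never inspects what the operators compute, so swapping in a different pair of operators (for example, an inverted index) would require no change to `run_mapreduce`. That separation is the point the abstract makes about application independence.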