Title: DISTRIBUTED DATA REORGANIZATION FOR PARALLEL EXECUTION ENGINES
Abstract: A distributed data reorganization system and method for mapping and reducing raw data containing a plurality of data records. Embodiments of the distributed data reorganization system and method operate in a general-purpose parallel execution environment that uses an arbitrary communication directed acyclic graph. The vertices of the graph accept multiple data inputs and generate multiple data outputs, and may be of different types. Embodiments of the distributed data reorganization system and method include a plurality of distributed mappers that use mapping criteria supplied by a developer to map the plurality of data records to data buckets. The mapped data records and their data bucket identifications are input to a plurality of distributed reducers. Each distributed reducer groups together data records having the same data bucket identification and then uses merge logic supplied by the developer to reduce the grouped data records to obtain reorganized data.
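The map-to-bucket and reduce-by-bucket flow described in the abstract can be illustrated with a minimal single-process sketch. It is not the patented implementation: the function names (`run_mappers`, `run_reducers`, `reorganize`), the hash-based routing of buckets to reducers, and the sample records are all assumptions made for illustration, and the parallel execution graph is simulated with plain Python lists.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Sequence, Tuple, TypeVar

Record = TypeVar("Record")
Merged = TypeVar("Merged")


def run_mappers(
    partitions: Sequence[Sequence[Record]],
    bucket_of: Callable[[Record], Hashable],
) -> List[List[Tuple[Hashable, Record]]]:
    """Simulate one mapper per partition (hypothetical helper).

    Each mapper applies the developer-supplied mapping criteria
    (`bucket_of`) and emits (bucket id, record) pairs.
    """
    return [[(bucket_of(rec), rec) for rec in part] for part in partitions]


def run_reducers(
    mapped: Sequence[Sequence[Tuple[Hashable, Record]]],
    merge: Callable[[List[Record]], Merged],
    num_reducers: int,
) -> Dict[Hashable, Merged]:
    """Simulate the reducers (hypothetical helper).

    Pairs are routed to a reducer by hashing the bucket id (an assumed
    routing rule), so all records of one bucket reach the same reducer.
    Each reducer groups records by bucket id, then applies the
    developer-supplied merge logic to every group.
    """
    reducers: List[Dict[Hashable, List[Record]]] = [
        defaultdict(list) for _ in range(num_reducers)
    ]
    for mapper_output in mapped:
        for bucket, rec in mapper_output:
            reducers[hash(bucket) % num_reducers][bucket].append(rec)

    reorganized: Dict[Hashable, Merged] = {}
    for groups in reducers:
        for bucket, records in groups.items():
            reorganized[bucket] = merge(records)
    return reorganized


def reorganize(partitions, bucket_of, merge, num_reducers=2):
    """Chain the mapper and reducer stages."""
    return run_reducers(run_mappers(partitions, bucket_of), merge, num_reducers)


# Illustrative input: (key, count) records split across two partitions.
sample_partitions = [[("a", 1), ("b", 2)], [("a", 3)]]
totals = reorganize(
    sample_partitions,
    bucket_of=lambda rec: rec[0],                      # mapping criteria
    merge=lambda recs: sum(count for _, count in recs),  # merge logic
)
```

Here the developer supplies only `bucket_of` and `merge`; grouping and routing are handled by the framework stages, which mirrors the division of labor the abstract describes.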
Publication No.: US2010281078(A1)  Publication date: 2010.11.04
Application No.: US20090433880  Filing date: 2009.04.30
Applicant: MICROSOFT CORPORATION  Inventors: WANG TAIFENG; LIU TIE-YAN
Classification: G06F7/00; G06F3/048; G06F17/30  Main classification: G06F7/00
Agency:  Agent:
Main claim:
Address: