Title of invention: General purpose distributed data parallel computing using a high level language
Abstract: General-purpose distributed data-parallel computing using a high-level language is disclosed. Data-parallel portions of a sequential program written by a developer in a high-level language are automatically translated into a distributed execution plan, which is then executed on large compute clusters. The developer can therefore write the program using familiar programming constructs of the high-level language, and developers without experience with distributed compute systems are able to take advantage of such systems.
Publication number: US9110706(B2) Publication date: 2015.08.18
Application number: US200912368231 Filing date: 2009.02.09
Applicant: Microsoft Technology Licensing, LLC Inventors: Yu Yuan;Fetterly Dennis;Isard Michael;Erlingsson Ulfar;Budiu Mihai
Classification: G06F9/445;G06F15/16;G06F9/45 Main classification: G06F9/445
Agency: (not listed) Agents: Yee Judy;Minhas Micky
Principal claim: 1. A system, comprising: a processor readable storage hardware device; a user interface; and a processor coupled to the processor readable hardware storage device and to the user interface, wherein the processor readable hardware storage device includes instructions that cause the processor to: execute a sequential application program comprising a data parallel portion that includes an expression, wherein the application is written in a high-level language and comprises both imperative operations and declarative operations; access the expression from a portion of the sequential application program that comprises a declarative operation; based on the expression, automatically generate an execution plan graph, the execution plan graph including a directed graph having vertices that represent processes and edges between the vertices that represent data channels, the execution plan graph for executing the expression in parallel at nodes of a compute cluster, including causing the processor to break the expression into a plurality of sub-expressions, each of the sub-expressions is a vertex in the directed graph; automatically generate vertex code for the vertices of the execution plan graph; automatically generate serialization code that allows data to be passed in the data channels between the vertices; provide the execution plan graph, the serialization code, and the vertex code to an execution engine of the compute cluster that manages parallel execution of the expression in the compute cluster based on the execution plan graph, the serialization code, and the vertex code; receive results of executing the execution plan graph in the compute cluster; and execute a portion of the sequential application program that comprises an imperative operation to present the results in the user interface.
Address: Redmond WA US
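
Illustrative note: the flow recited in the principal claim (break a declarative expression into sub-expressions, make each one a vertex of a directed graph, connect vertices by data channels, serialize data on those channels, then hand the plan to an engine and present the result imperatively) can be sketched in a few lines. This is only a toy, single-process sketch under stated assumptions and is not the patented implementation: the names `Vertex`, `build_plan`, and `run_plan` are hypothetical, `pickle` stands in for the generated serialization code, and a simple loop stands in for the cluster execution engine.

```python
import pickle
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Vertex:
    """One sub-expression of the query (hypothetical structure)."""
    name: str
    func: Callable[[list], list]                      # stand-in for "vertex code"
    inputs: List[int] = field(default_factory=list)   # upstream vertices (edges = channels)


def build_plan(stages):
    """Break a chained expression into a linear directed graph of vertices."""
    plan = []
    for i, (name, func) in enumerate(stages):
        upstream = [i - 1] if i > 0 else []
        plan.append(Vertex(name, func, upstream))
    return plan


def run_plan(plan, source):
    """Toy engine: run vertices in order, passing pickled data on each channel."""
    channels = {}
    for i, vertex in enumerate(plan):
        if vertex.inputs:
            # Deserialize everything arriving on the incoming channels.
            data = [rec for j in vertex.inputs for rec in pickle.loads(channels[j])]
        else:
            data = list(source)
        # Serialize this vertex's output for downstream vertices.
        channels[i] = pickle.dumps(vertex.func(data))
    return pickle.loads(channels[len(plan) - 1])


# Declarative part: filter even numbers, square them, sum the squares.
stages = [
    ("where_even", lambda xs: [x for x in xs if x % 2 == 0]),
    ("select_square", lambda xs: [x * x for x in xs]),
    ("aggregate_sum", lambda xs: [sum(xs)]),
]

plan = build_plan(stages)
result = run_plan(plan, range(10))

# Imperative part: present the result to the user.
print("sum of even squares:", result[0])
```

In a real cluster setting each vertex would run as a separate process on some node and the channels would be files or network streams; the sketch keeps everything in one process only to make the graph structure visible.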