Title: Automatic exploitation of data parallelism in streaming applications
Abstract: An embodiment of the invention provides a method for exploiting stateless and stateful data parallelism in a streaming application, wherein a compiler determines whether an operator of the streaming application is safe to parallelize based on a definition of the operator and an instance of the definition. The operator is not safe to parallelize when the operator has selectivity greater than 1, wherein the selectivity is the number of output tuples generated for each input tuple. Parallel regions are formed within the streaming application with the compiler when the operator is safe to parallelize. Synchronization strategies for the parallel regions are determined with the compiler, wherein the synchronization strategies are determined based on the definition of the operator and the instance of the definition. The synchronization strategies of the parallel regions are enforced with a runtime system.
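The selectivity rule and the grouping of safe operators into parallel regions can be illustrated with a minimal sketch. It is not the patented compiler: the `Operator` class, its `selectivity` field, and the rule that a region is a maximal run of consecutive safe operators in a linear pipeline are all assumptions made for illustration. The only rule taken from the abstract is that an operator with selectivity greater than 1 is not safe to parallelize.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Operator:
    """Hypothetical operator description; not the patent's data model."""
    name: str
    selectivity: float  # output tuples generated per input tuple


def is_safe_to_parallelize(op: Operator) -> bool:
    # Rule from the abstract: selectivity greater than 1 means "not safe".
    # A real compiler would also inspect the operator definition and its
    # instance; this sketch checks selectivity only.
    return op.selectivity <= 1


def form_parallel_regions(pipeline: List[Operator]) -> List[List[str]]:
    """Group consecutive safe operators of a linear pipeline into regions.

    Assumption: a region is a maximal run of adjacent safe operators.
    """
    regions: List[List[str]] = []
    current: List[str] = []
    for op in pipeline:
        if is_safe_to_parallelize(op):
            current.append(op.name)
        elif current:
            # An unsafe operator closes the region being built.
            regions.append(current)
            current = []
    if current:
        regions.append(current)
    return regions


pipeline = [
    Operator("Filter", 0.5),    # drops some tuples
    Operator("Map", 1.0),       # one output per input
    Operator("Tokenize", 5.0),  # several outputs per input: not safe
    Operator("Aggregate", 1.0),
]
regions = form_parallel_regions(pipeline)
```

In this example, `Tokenize` splits the pipeline, so `Filter` and `Map` fall into one candidate region and `Aggregate` into another.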
Publication number: US9170794(B2) Publication date: 2015.10.27
Application number: US201213596676 Filing date: 2012.08.28
Applicant: International Business Machines Corporation Inventors: Gedik Bugra; Hirzel Martin J.; Schneider Scott A.; Wu Kun-Lung
Classification: G06F9/45 Main classification: G06F9/45
Agency: Cahn & Samuels, LLP Agent: Cahn & Samuels, LLP
Main claim: 1. A method for exploiting stateless and stateful data parallelism in a streaming application, said method comprising: determining with a compiler whether an operator of the streaming application is safe to parallelize, said determining whether an operator of the streaming application is safe to parallelize being based on a definition of the operator and an instance of the definition, the definition of the operator including a template of the operator, and the instance of the definition including a modified version of the template, the operator being not safe to parallelize when the operator has selectivity greater than 1, the selectivity being a number of output tuples generated for each input tuple; forming parallel regions within the streaming application with the compiler when the operator is safe to parallelize; determining synchronization strategies for the parallel regions with the compiler, the synchronization strategies being determined based on the definition of the operator and the instance of the definition; enforcing the synchronization strategies of the parallel regions with a runtime system; and performing a shuffle when two adjacent parallel regions include incompatible keys.
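The final step of the claim, performing a shuffle between adjacent parallel regions with incompatible keys, might be sketched as follows. The claim does not define key compatibility, so the rule used here is an assumption: a region with no partitioning keys is treated as stateless and compatible with any neighbor, and two keyed regions are compatible only when their key sets are equal. `Region`, `keys_compatible`, `plan_connections`, and `channel_for` are hypothetical names, and CRC32 is used only to get a hash that is stable across runs.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Region:
    """Hypothetical parallel region; empty keys means stateless."""
    name: str
    keys: FrozenSet[str]


def keys_compatible(up: Region, down: Region) -> bool:
    # Assumed rule: stateless regions pair with anything; keyed regions
    # must partition on exactly the same attributes.
    return not up.keys or not down.keys or up.keys == down.keys


def plan_connections(regions: List[Region]) -> List[Tuple[str, str, str]]:
    """Mark each adjacent pair as a direct connection or a shuffle."""
    plan = []
    for up, down in zip(regions, regions[1:]):
        kind = "direct" if keys_compatible(up, down) else "shuffle"
        plan.append((up.name, down.name, kind))
    return plan


def channel_for(tup: Dict[str, object], keys: FrozenSet[str], width: int) -> int:
    """Pick the downstream channel for a tuple during a shuffle.

    Tuples with equal key values always map to the same channel, which is
    what a keyed (partitioned stateful) operator needs.
    """
    material = "|".join(str(tup[k]) for k in sorted(keys))
    return zlib.crc32(material.encode("utf-8")) % width


regions = [
    Region("parse", frozenset()),              # stateless
    Region("per_user", frozenset({"user"})),   # keyed on "user"
    Region("per_symbol", frozenset({"symbol"})),  # keyed on "symbol"
]
plan = plan_connections(regions)
```

Under these assumptions, only the `per_user` to `per_symbol` boundary needs a shuffle, because tuples partitioned by `user` have to be redistributed by `symbol` before the downstream region can process them.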
Address: Armonk NY US