Title of invention: Processing element management in a streaming data system
Abstract: Stream applications may inefficiently use the hardware resources that execute the processing elements of the data stream. For example, a compute node may host four processing elements and execute each using a CPU. However, other CPUs on the compute node may sit idle. To take advantage of these available hardware resources, a stream programmer may identify one or more processing elements that may be cloned. The cloned processing elements may be used to generate a different execution path that is parallel to the execution path that includes the original processing elements. Because the cloned processing elements contain the same operators as the original processing elements, the data stream that was previously flowing through only the original processing element may be split and sent through both the original and cloned processing elements. In this manner, the parallel execution path may use underutilized hardware resources to increase the throughput of the data stream.
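The splitting described in the abstract can be illustrated with a minimal sketch. It assumes a round-robin split and an order-restoring merge, neither of which the record specifies; the names `make_processing_element`, `process_stream`, `double`, and `increment` are hypothetical, and Python threads stand in for the idle CPUs.

```python
from concurrent.futures import ThreadPoolExecutor


def make_processing_element(operators):
    """Build a processing element that applies its operators in sequence."""
    def run(tup):
        for op in operators:
            tup = op(tup)
        return tup
    return run


# Hypothetical operators, used only for illustration.
def double(x):
    return x * 2


def increment(x):
    return x + 1


original_pe = make_processing_element([double, increment])
# The clone contains the same operators as the original.
cloned_pe = make_processing_element([double, increment])


def process_stream(tuples, paths):
    """Split tuples round-robin across paths, then merge into one stream."""
    n = len(paths)
    # Path i receives tuples i, i + n, i + 2n, ...
    shares = [tuples[i::n] for i in range(n)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(
            pool.map(lambda pair: [pair[0](t) for t in pair[1]],
                     zip(paths, shares))
        )
    # Assumption: the merge restores the original tuple order.
    return [results[i % n][i // n] for i in range(len(tuples))]


merged = process_stream([1, 2, 3, 4, 5], [original_pe, cloned_pe])
```

Because both paths hold the same operators, the merged result matches what the original processing element alone would produce; only the work is divided.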
Publication number: US9535707(B2) Publication date: 2017.01.03
Application number: US201213709405 Filing date: 2012.12.10
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors: Branson Michael J.; Cradick Ryan K.; Santosuosso John M.; Schulz Brandon W.
Classification: G06F15/16; G06F9/44; G06F9/50 Main classification: G06F15/16
Agency: Patterson + Sheridan, LLP Agent: Patterson + Sheridan, LLP
Principal claim: 1. A method, comprising: receiving streaming data to be processed by a plurality of processing elements, each comprising one or more operators, the operators processing at least a portion of the received data by operation of one or more computer processors; establishing an operator graph of the operators, the operator graph defining at least one execution path in which a first operator of the operators is configured to receive data tuples from at least one upstream operator and transmit data tuples to at least one downstream operator; identifying, relative to predefined criteria, a first and second underutilized hardware resource in a computing system that executes the operators; cloning a first processing element of the plurality of processing elements such that the cloned processing element comprises the same one or more operators as the first processing element; determining respective communication speeds between a hardware resource hosting the first operator and each of the first and second underutilized hardware resources, wherein the first operator is one of upstream and downstream of the cloned processing element in the operator graph; selecting, based on the respective communication speeds, one of the first and second underutilized hardware resources to host the cloned processing element; generating, after determining to clone the first processing element, a first execution path comprising the cloned processing element, the first execution path is configured to process at least a portion of the received streaming data using the selected underutilized hardware resource, wherein the first execution path executes in parallel to a previously configured second execution path that includes the first processing element; and activating, in the operator graph downstream from the first and second execution paths, a second processing element comprising a merge operator that merges data tuples received from both the first and second execution paths into a single data stream.
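The host-selection steps of the claim (identifying underutilized resources, comparing communication speeds, and selecting a host for the clone) might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the utilization threshold stands in for the unspecified "predefined criteria", and the `Resource` type, the resource names, and the link speeds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    utilization: float  # fraction of capacity in use, 0.0 to 1.0


def find_underutilized(resources, threshold=0.25):
    """Assumed criterion: utilization below a fixed threshold."""
    return [r for r in resources if r.utilization < threshold]


def select_host(neighbor_host, candidates, link_speed):
    """Pick the candidate with the fastest link to the neighboring operator.

    link_speed maps (host_name, candidate_name) to a speed; the units are
    arbitrary as long as they are consistent across entries.
    """
    return max(candidates, key=lambda r: link_speed[(neighbor_host, r.name)])


# Hypothetical system: cpu0 hosts the operator adjacent to the clone.
resources = [
    Resource("cpu0", 0.90),
    Resource("cpu1", 0.10),  # idle CPU on the same compute node
    Resource("cpu2", 0.05),  # idle CPU on a remote compute node
]
link_speed = {
    ("cpu0", "cpu1"): 1000.0,  # assumed fast intra-node link
    ("cpu0", "cpu2"): 100.0,   # assumed slower network link
}

candidates = find_underutilized(resources)
host = select_host("cpu0", candidates, link_speed)
```

In this example cpu2 is the least loaded resource, but cpu1 is selected because the claim ranks candidates by communication speed to the adjacent operator rather than by idleness alone.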
Address: Armonk, NY, US