Title: Operator graph changes in response to dynamic connections in stream computing applications
Abstract: A stream computing application may permit one job to connect to a data stream of a different job. As more and more jobs dynamically connect to the data stream, the connections may have a negative impact on the performance of the job that generates the data stream. Accordingly, a variety of metrics and statistics (e.g., CPU utilization or tuple rate) may be monitored to determine if the dynamic connections are harming performance. If so, the stream computing system may be optimized to mitigate the effects of the dynamic connections. For example, particular operators may be unfused from a processing element and moved to a compute node that has available computing resources. Additionally, the stream computing application may clone the data stream in order to distribute the workload of transmitting the data stream to the connected jobs.
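The two mitigations named in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names `choose_target_node` and `distribute_consumers`, the use of CPU utilization as the measure of available resources, and the round-robin assignment of connected jobs to stream clones are all hypothetical choices made for the example.

```python
def choose_target_node(cpu_utilization):
    """Pick the compute node with the lowest CPU utilization.

    cpu_utilization maps a node name to a utilization fraction in [0, 1].
    Assumption: "available computing resources" is approximated by CPU alone.
    """
    if not cpu_utilization:
        raise ValueError("no compute nodes available")
    return min(cpu_utilization, key=cpu_utilization.get)


def distribute_consumers(consumers, clone_count):
    """Spread connected jobs across clone_count copies of the data stream.

    Assumption: a simple round-robin split; a real system could weight the
    assignment by each consumer's load instead.
    """
    if clone_count < 1:
        raise ValueError("clone_count must be at least 1")
    clones = [[] for _ in range(clone_count)]
    for index, consumer in enumerate(consumers):
        clones[index % clone_count].append(consumer)
    return clones
```

An unfused operator would be moved to the node returned by `choose_target_node`, while `distribute_consumers` decides which clone of the stream serves each connected job.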
Publication No.: US9135057(B2)
Publication Date: 2015.09.15
Application No.: US201313780800
Filing Date: 2013.02.28
Applicant: International Business Machines Corporation
Inventors: Branson Michael J.;Cradick Ryan K.;Santosuosso John M.;Schulz Brandon W.
Classification: G06F9/46;G06F9/455;G06F9/48;G06F9/50;G06F11/34
Main Classification: G06F9/46
Agency: Patterson & Sheridan, LLP
Agent: Patterson & Sheridan, LLP
Principal Claim: 1. A method for optimizing a stream computing application, comprising: executing a first job and a second job, each comprising a plurality of respective operators that process streaming data by operation of one or more computer processors, wherein the plurality of respective operators in the first and second jobs are, respectively, interconnected such that data tuples flow between the plurality of respective operators to perform the first and second jobs; establishing an operator graph comprising the plurality of respective operators of both the first and second jobs, the operator graph defining at least one respective execution path through the plurality of respective operators for the first job and for the second job; while the first and second jobs are executing, establishing a connection between the first job and the second job by transmitting a data stream from a first operator of the first job to a second operator of the second job, wherein the first and second jobs are in the operator graph both before and after the connection is established; before establishing the connection between the first job to the second job, setting the data stream as exportable, wherein the plurality of respective operators associated with the second job do not receive data from, or send data to, the plurality of respective operators associated with the first job prior to setting the data stream as exportable; monitoring a performance indicator associated with the first operator of the first job, the performance indicator measuring an effect that the connection between the first and second jobs has on a performance of the first job; and upon determining a value of the performance indicator satisfies a predefined threshold, optimizing the stream computing application to improve the value of the performance indicator.
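The sequence of steps in the claim (mark the stream exportable, connect the jobs, monitor an indicator, compare it with a threshold) can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for the example: the `Operator` and `OperatorGraph` classes, the `exportable` flag, the use of tuple rate as the performance indicator, and the rule that the threshold is satisfied when the rate falls below a floor. The claim itself does not prescribe any of these details.

```python
from dataclasses import dataclass, field


@dataclass
class Operator:
    name: str
    job: str
    exportable: bool = False   # hypothetical flag: stream may be consumed by other jobs
    tuple_rate: float = 0.0    # assumed performance indicator, in tuples per second
    consumers: list = field(default_factory=list)


class OperatorGraph:
    """Holds the operators of all jobs, before and after any cross-job connection."""

    def __init__(self):
        self.operators = {}

    def add(self, op):
        self.operators[op.name] = op

    def connect(self, source, target):
        src, dst = self.operators[source], self.operators[target]
        # A connection across jobs is only allowed once the stream is exportable.
        if src.job != dst.job and not src.exportable:
            raise PermissionError(f"stream of {source} is not exportable")
        src.consumers.append(dst.name)


def threshold_satisfied(op, min_tuple_rate):
    """Assumed rule: the threshold is satisfied when the rate drops below a floor."""
    return op.tuple_rate < min_tuple_rate
```

In this sketch, a monitoring loop would call `threshold_satisfied` on the exporting operator after each new connection and trigger an optimization step when it returns `True`.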
Address: Armonk NY US