Title of Invention: HANDLING MULTIPLE TASK SEQUENCES IN A STREAM PROCESSING FRAMEWORK
Abstract: The technology disclosed improves existing stream processing systems by providing the ability to both scale up and scale down resources within the infrastructure of a stream processing system. In particular, the technology disclosed relates to a dispatch system for a stream processing system that adapts its behavior to the computational capacity of the system based on a run-time evaluation. The technical solution includes, during run-time execution of a pipeline, comparing a count of available physical threads against a set number of logically parallel threads. When the count of available physical threads equals or exceeds the number of logically parallel threads, the solution includes concurrently processing the batches at the physical threads. Further, when there are fewer available physical threads than the number of logically parallel threads, the solution includes multiplexing the batches sequentially over the available physical threads.
Publication Number: US2017075693(A1)    Publication Date: 2017.03.16
Application Number: US201514986351    Filing Date: 2015.12.31
Applicant: salesforce.com, inc.    Inventors: Bishop Elden Gregory; Chao Jeffrey
Classification: G06F9/38; G06F9/50; G06F9/48    Main Classification: G06F9/38
Agency:    Agent:
Principal Claim: 1. A method of handling multiple task sequences, including long tail task sequences, on a limited number of worker nodes of a stream processing system, the method including:
    defining containers over worker nodes that have physical threads, with one physical thread utilizing a whole processor core of a worker node, for multiple task sequences,
    queuing data from incoming near real-time (NRT) data streams in pipelines that run in the containers,
    processing data from the NRT data streams as batches using a container-coordinator that controls dispatch of the batches, and
    dispatching the batches to the physical threads, where a batch runs to completion or to a time out, including:
        during execution, comparing a count of available physical threads against a set number of logically parallel threads,
        when the count of available physical threads equals or exceeds the number of logically parallel threads, concurrently processing the batches at the physical threads, and
        when there are fewer available physical threads than the number of logically parallel threads, multiplexing the batches sequentially over the available physical threads.
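The dispatch decision recited in claim 1 can be illustrated with a minimal Python sketch, assuming a simple thread-pool model. All identifiers here (dispatch_batches, process, logical_parallelism, available_physical_threads) are hypothetical and not drawn from the patent; the actual system operates on containers, pipelines, and a container-coordinator rather than a bare thread pool.

```python
# Hypothetical sketch of the dispatch decision in claim 1: compare the count of
# available physical threads to the configured number of logically parallel
# threads, then either process batches concurrently or multiplex them
# sequentially over the threads that are actually available.
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable


def dispatch_batches(batches: Iterable[Any],
                     process: Callable[[Any], None],
                     logical_parallelism: int,
                     available_physical_threads: int) -> None:
    """Dispatch each batch so that it runs to completion on a physical thread."""
    batches = list(batches)
    if available_physical_threads >= logical_parallelism:
        # Enough physical threads: process the batches concurrently.
        with ThreadPoolExecutor(max_workers=logical_parallelism) as pool:
            futures = [pool.submit(process, b) for b in batches]
            for future in futures:
                future.result()  # each batch runs to completion
    else:
        # Fewer physical threads than logically parallel threads: the executor's
        # queue multiplexes the batches sequentially over the available threads.
        with ThreadPoolExecutor(max_workers=max(1, available_physical_threads)) as pool:
            futures = [pool.submit(process, b) for b in batches]
            for future in futures:
                future.result()
```

In this sketch the only difference between the two branches is the pool size; the point is that the degree of concurrency is chosen at run time from the observed count of available physical threads rather than fixed at pipeline definition time.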
Address: San Francisco, CA, US