Title of invention: Task execution and management in a clustered computing environment
Abstract: Machines, systems and methods for task management in a computer-implemented system. The method comprises registering a task with brokers residing on one or more nodes to manage the execution of the task to completion, wherein a first broker is accompanied by a first set of worker threads co-located on the node on which the first broker is executed, wherein the first broker assigns responsibility of execution for the task to one or more worker threads in the first set of co-located worker threads, wherein in response to a failure associated with a first worker thread in the first set, the first broker reassigns the responsibility of execution for the task to a second worker thread in the first set, and wherein in response to a failure associated with the first broker, a second broker assigns responsibility of execution for the task to one or more co-located worker threads.
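The two failover paths in the abstract (worker-to-worker within one broker, then broker-to-broker) can be illustrated with a minimal sketch. This is not the patented implementation. The names `Worker`, `Broker`, `run_to_completion`, the `healthy` and `alive` flags, and the choice to treat "no healthy worker left" as a broker-level failure are all assumptions made for illustration.

```python
from dataclasses import dataclass


class WorkerFailure(Exception):
    """Hypothetical signal that a worker thread could not finish a task."""


class BrokerFailure(Exception):
    """Hypothetical signal that a broker could not manage a task."""


@dataclass
class Worker:
    name: str
    healthy: bool = True  # assumption: failure is modeled as a simple flag

    def run(self, task):
        if not self.healthy:
            raise WorkerFailure(self.name)
        return f"{task} done by {self.name}"


@dataclass
class Broker:
    name: str
    workers: list  # worker threads co-located with this broker
    alive: bool = True

    def execute(self, task):
        if not self.alive:
            raise BrokerFailure(self.name)
        # Try each co-located worker in turn; on a worker failure the
        # broker reassigns the task to the next worker in the same set.
        for worker in self.workers:
            try:
                return worker.run(task)
            except WorkerFailure:
                continue
        # Assumption: running out of workers counts as a broker failure.
        raise BrokerFailure(f"{self.name}: no healthy worker")


def run_to_completion(task, brokers):
    """Hand the task to each registered broker until one completes it."""
    for broker in brokers:
        try:
            return broker.execute(task)
        except BrokerFailure:
            continue  # a second broker takes over with its own workers
    raise RuntimeError("no broker could complete the task")
```

With one failed worker, the same broker finishes the task on a second worker. With a failed broker, the task moves to the next broker and that broker's own worker set.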
Publication number: US9223626(B2)  Publication date: 2015.12.29
Application number: US201213598638  Filing date: 2012.08.30
Applicant: International Business Machines Corporation  Inventors: Factor Michael E.; Hadas David; Kolodner Elliot K.
Classification: G06F9/46; G06F9/50; G06F11/14; G06F11/20  Main classification: G06F9/46
Agency:  Agent: Sharkan Noah A.; Erez Suzanne
Main claim: 1. A task management method in a computer implemented system, the method comprising: registering a task with one or more brokers residing on one or more nodes to manage the execution of the task to completion, designating a lead broker based on one or more of: a random designation, a vote taken from a plurality of brokers, or a workload of a broker in relation to other brokers, wherein a first broker is accompanied by a first set of dedicated worker threads co-located on the one or more nodes on which the first broker is executed, wherein the first broker assigns responsibility of execution for the task to the one or more dedicated worker threads in the first set of dedicated worker threads, wherein in response to a failure associated with a first dedicated worker thread in the first set of dedicated worker threads, the first broker reassigns the responsibility of execution for the task to a second dedicated worker thread in the first set of dedicated worker threads, wherein in response to a failure associated with the first broker, a second broker assigns responsibility of execution for the task to one or more dedicated worker threads of a second set of dedicated worker threads co-located on one or more nodes with the second broker, wherein the designated lead broker requests one or more brokers to store replicas of the task until a level of replication reaches a predefined threshold, and wherein in response to the task being completed, the lead broker performs one or more of: notifying the requested one or more brokers of the completion of the task, and deleting the task from shared storage.
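The lead-broker steps of claim 1 (designation, replication up to a threshold, and cleanup on completion) can be sketched as follows. This is an illustrative reading only. The class `ReplicaBroker`, the functions `designate_lead`, `replicate` and `complete`, the integer `load` field, and the decision to count only replicas held by non-lead brokers toward the threshold are assumptions not stated in the claim. The claim requires "one or more of" notifying and deleting; this sketch does both.

```python
import random
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ReplicaBroker:
    name: str
    load: int = 0  # assumption: workload is a single comparable number
    replicas: set = field(default_factory=set)

    def store_replica(self, task_id):
        self.replicas.add(task_id)
        return True

    def notify_complete(self, task_id):
        # On completion notice, the broker drops its replica of the task.
        self.replicas.discard(task_id)


def designate_lead(brokers, strategy, votes=None):
    """Pick a lead broker by random choice, vote, or lowest workload."""
    if strategy == "random":
        return random.choice(brokers)
    if strategy == "vote":
        # votes is a list of broker names; the most frequent name wins.
        winner = Counter(votes).most_common(1)[0][0]
        return next(b for b in brokers if b.name == winner)
    if strategy == "workload":
        return min(brokers, key=lambda b: b.load)
    raise ValueError(f"unknown strategy: {strategy}")


def replicate(task_id, lead, brokers, threshold):
    """Lead asks other brokers to store replicas until threshold is met."""
    holders = []
    for broker in brokers:
        if len(holders) >= threshold:
            break
        if broker is lead:
            continue
        if broker.store_replica(task_id):
            holders.append(broker)
    return holders


def complete(task_id, holders, shared_storage):
    """On completion, notify replica holders and delete the shared copy."""
    for broker in holders:
        broker.notify_complete(task_id)
    shared_storage.discard(task_id)
```

Here `shared_storage` is modeled as a plain Python `set` of task identifiers, which stands in for whatever shared store a real deployment would use.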
Address: Armonk, NY, US