Title of invention Apparatus and method for passively monitoring liveness of jobs in a clustered computing environment
Abstract An apparatus and method passively determine when a job in a clustered computing environment is dead. Each node in the cluster has a cluster engine for communicating between each job on the node and jobs on other nodes. A protocol is defined that includes one or more acknowledge (ACK) rounds and that performs only local processing between ACK rounds. The protocol is executed by jobs that are members of a defined group. Each job in the group has one or more work threads that execute the protocol. In addition, each job has a main thread that communicates between the job and jobs on other nodes (through the cluster engine), routes appropriate messages from the cluster engine to a work thread, and signals the cluster engine if a fault occurs while a work thread executes the protocol. By ensuring that a dead job is reported to the other members of the group, liveness information for group members can be monitored without the overhead associated with active liveness checking.
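The division of labor between the main thread and a work thread can be illustrated with a minimal Python sketch. It is not the patented implementation: the names ClusterEngine, Job, report_fault, and run are hypothetical, the cluster engine is reduced to an in-process object, and ACK rounds are not modeled. It only shows the idea that a fault in the work thread is noticed by the main thread and reported, so no peer has to poll for liveness.

```python
import queue
import threading


class ClusterEngine:
    """Hypothetical stand-in for the per-node cluster engine."""

    def __init__(self):
        self.dead_jobs = []            # jobs reported dead to the group
        self._lock = threading.Lock()

    def report_fault(self, job_name):
        # In a real cluster this would notify the other group members.
        with self._lock:
            self.dead_jobs.append(job_name)


class Job:
    """One group member: a main thread plus a single work thread."""

    def __init__(self, name, engine, protocol):
        self.name = name
        self.engine = engine
        self.protocol = protocol       # callable applied to each message
        self.inbox = queue.Queue()     # main thread -> work thread
        self.fault = threading.Event()

    def _work(self):
        # Work thread: execute the protocol on each routed message.
        try:
            while True:
                msg = self.inbox.get()
                if msg is None:        # sentinel: no more messages
                    return
                self.protocol(msg)
        except Exception:
            self.fault.set()           # record the fault for the main thread

    def run(self, messages):
        # Main thread: route messages, then report any fault to the engine.
        worker = threading.Thread(target=self._work)
        worker.start()
        for msg in messages:
            self.inbox.put(msg)
        self.inbox.put(None)
        worker.join()
        if self.fault.is_set():
            self.engine.report_fault(self.name)
        return not self.fault.is_set()
```

In this sketch, a job whose protocol callable raises an exception ends up in the engine's dead_jobs list, while a job that processes every message cleanly does not. Other members would learn of the failure from that report rather than from periodic heartbeat checks.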
Publication number US6990668(B1) Publication date 2006.01.24
Application number US19990421585 Filing date 1999.10.20
Applicant INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors MILLER ROBERT;MOREY VICKI LYNN;THAYIB KISWANTO;WILLIAMS LAURIE ANN
Classification H04L1/16;H04L29/06 Main classification H04L1/16
Agency Agent
Main claim
Address