Independent claim |
1. In a distributed database management processing system including a plurality of nodes and means at each node for establishing a communications path with every other node in the system, wherein each node includes means for transmitting a ping message and for transmitting a ping acknowledgement message in response to a received ping message, a method for detecting failures of individual nodes in the system comprising the steps of:
A) designating one node as a leader node for analyzing information,
B) at each node as an I-node, transmitting a ping message to each other node as a receiving node, monitoring at the I-node the corresponding communications path for a valid response from each receiving node, and responding to an invalid response by designating the corresponding receiving node as a suspicious node,
C) generating a message for transmittal to the leader node with an identification of the I-node and the suspicious node,
D) responding to the message in the leader node by identifying other instances for which communications problems have been recorded with the identified suspicious node, determining a number of I-nodes included in the other instances, and, where fewer than a majority of the I-nodes are included in the other instances, sending an acknowledgement message to all the I-nodes, and
E) selectively designating suspicious nodes as failed where the majority of the I-nodes identify the suspicious nodes in a generated message or the majority of the I-nodes identify the suspicious nodes in a response to the acknowledgement message. |
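
For illustration only, steps B) through E) might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `find_suspects`, `Leader`, `on_suspect_message`, and `on_ack_response` are hypothetical, the `ping` callable stands in for the ping and ping acknowledgement exchange, and "majority" is assumed to mean more than half of all nodes in the system, which the claim does not specify.

```python
from collections import defaultdict


def find_suspects(i_node, peers, ping):
    """Steps B and C: ping every other node and build (I-node, suspect) messages.

    `ping` is a hypothetical callable that returns True for a valid response.
    """
    return [(i_node, peer) for peer in peers if peer != i_node and not ping(peer)]


class Leader:
    """Hypothetical leader-side bookkeeping for steps D and E."""

    def __init__(self, node_ids):
        self.node_ids = set(node_ids)
        # suspicious node -> set of I-nodes that have reported it
        self.reports = defaultdict(set)
        self.failed = set()

    def _has_majority(self, suspect):
        # Assumption: majority means more than half of all nodes in the system.
        return len(self.reports[suspect]) > len(self.node_ids) // 2

    def on_suspect_message(self, i_node, suspect):
        """Step D: record the report and count I-nodes reporting this suspect."""
        self.reports[suspect].add(i_node)
        if self._has_majority(suspect):
            self.failed.add(suspect)  # step E, first branch
            return "failed"
        # Fewer than a majority: the leader would now send an
        # acknowledgement message to all the I-nodes.
        return "ack_all"

    def on_ack_response(self, i_node, suspect, also_suspects):
        """Step E, second branch: tally responses to the acknowledgement."""
        if also_suspects:
            self.reports[suspect].add(i_node)
        if self._has_majority(suspect):
            self.failed.add(suspect)
            return "failed"
        return "pending"
```

With five nodes under this assumption, two reports against the same node leave it merely suspicious, and a third I-node confirming the problem in its response to the acknowledgement message tips it to failed.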