Abstract |
<p>Failures in a fault-tolerant computer system that includes two or more input/output processors connected to a data communication system are detected by monitoring data communication. The computer system can detect failures associated with a primary input/output processor as well as with a standby input/output processor, and can discriminate between failures of the input/output processors and communication failures in the data communication network itself. In addition to heartbeat-like transmissions, various other categories of data communication are used to detect failures. The system can detect failures when the input/output processors are on a common network segment, which allows the processors to monitor identical data traffic, and also when the processors are on different segments, where the filtering behavior of network elements such as active hubs may prevent the processors from monitoring identical data traffic.</p> |
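As an illustration only, the distinction between a processor failure and a network failure on a common segment could be sketched as follows. The rule, the `Diagnosis` enum, and the `classify` function are hypothetical and are not taken from the abstract: the sketch assumes that one input/output processor watches for its peer's heartbeat-like transmissions and also notes whether any other traffic arrives during the same interval.

```python
from enum import Enum


class Diagnosis(Enum):
    """Hypothetical outcomes of one monitoring interval."""
    PEER_OK = "peer_ok"
    PEER_FAILED = "peer_failed"
    NETWORK_SUSPECT = "network_suspect"


def classify(peer_heartbeat_seen: bool, other_traffic_seen: bool) -> Diagnosis:
    """Classify one interval from the monitoring processor's point of view.

    Assumed rule (not from the source): on a common segment both processors
    see the same traffic, so a silent peer on an otherwise live network
    points to the peer, while total silence points to the network.
    """
    if peer_heartbeat_seen:
        return Diagnosis.PEER_OK
    if other_traffic_seen:
        # The network still carries traffic, so the missing heartbeat is
        # attributed to the peer processor.
        return Diagnosis.PEER_FAILED
    # Nothing was heard at all: suspect the network or the local link.
    return Diagnosis.NETWORK_SUSPECT
```

This rule relies on both processors seeing identical traffic. When the processors sit on different segments behind filtering elements, the absence of other traffic is weaker evidence, and a different rule would be needed.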