Abstract |
<p>A mechanism for maintaining a consistent state in main memory without constraining normal computer operation is provided, thereby enabling a computer system to recover from faults without loss of data or processing continuity. In a typical computer system, a processor and input/output elements are connected to a main memory via a memory bus. A shadow memory element, which includes a buffer memory and a main storage element, is also attached to this memory bus. During normal processing, data written to main memory is also captured by the buffer memory of the shadow memory element. When a checkpoint is desired (thereby establishing a consistent state in main memory to which all executing applications can safely return following a fault), the data previously captured in the buffer memory is copied to the main storage element of the shadow memory element. This structure and protocol can guarantee a consistent state in main memory, thus enabling fault-tolerant operation.</p> |
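
The capture-and-checkpoint protocol described above can be sketched as a small software model. This is only an illustration under stated assumptions, not the hardware design itself: the class and method names (`ShadowMemory`, `Machine`, `capture`, `checkpoint`, `recover`) are hypothetical, memory is modeled as a dictionary of address-to-value pairs, and the recovery step (discarding uncheckpointed writes and restoring main memory from the main storage element) is an assumed reading of how the checkpointed state would be used after a fault.

```python
class ShadowMemory:
    """Hypothetical model of the shadow memory element on the memory bus."""

    def __init__(self):
        # Buffer memory: writes captured since the last checkpoint, in bus order.
        self.buffer = []
        # Main storage element: memory contents as of the last checkpoint.
        self.storage = {}

    def capture(self, address, value):
        """Record a write observed on the memory bus."""
        self.buffer.append((address, value))

    def checkpoint(self):
        """Copy the captured writes into the main storage element."""
        for address, value in self.buffer:
            self.storage[address] = value
        self.buffer.clear()

    def discard(self):
        """Drop writes captured after the last checkpoint (assumed recovery step)."""
        self.buffer.clear()


class Machine:
    """Hypothetical processor-side view: main memory plus the attached shadow."""

    def __init__(self):
        self.main_memory = {}
        self.shadow = ShadowMemory()

    def write(self, address, value):
        # A normal write reaches main memory and is also captured by the shadow buffer.
        self.main_memory[address] = value
        self.shadow.capture(address, value)

    def checkpoint(self):
        # Establish a consistent state that execution can return to after a fault.
        self.shadow.checkpoint()

    def recover(self):
        # Assumption: after a fault, uncheckpointed writes are thrown away and
        # main memory is rebuilt from the shadow's main storage element.
        self.shadow.discard()
        self.main_memory = dict(self.shadow.storage)


machine = Machine()
machine.write(0x10, 1)
machine.checkpoint()      # state {0x10: 1} is now safe
machine.write(0x10, 2)    # later writes are only buffered, not yet checkpointed
machine.write(0x20, 3)
machine.recover()         # simulated fault: roll back to the checkpointed state
```

In this model, writes made after the last checkpoint never reach the main storage element, so the rolled-back state is exactly the one established at the checkpoint.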