Title of invention: Method and apparatus for recovering from hardware faults
Abstract: A method and apparatus for recovery from a fault occurring within a computing system using a hardware recovery module comprising a microprocessor dedicated to recovery control and a memory for storing system states. A recovery counter counts machine instructions executed since a previously recorded initial checkpoint. Each time the CPU transfers information directly from an I/O controller or the cache memory, the recovery module stores the data being transferred. Each time an interrupt is made to the CPU, the recovery module is notified of the interrupt and thereupon stores the count of machine instructions executed since the previously recorded initial checkpoint, together with information identifying the interrupt. When a fault is detected, the system is restored to the system state existing at the beginning of the checkpoint, and the processor synthetically executes the machine instructions originally executed after the initial checkpoint in a sequence substantially similar to the original sequence. During synthetic execution, the recovery module simulates the original inputs, suppresses outputs, and records completion of pre-fault I/O requests. Synthetic execution is abandoned when the instruction point at which the fault was detected is reached; true execution then resumes, and the recovery module thereafter simulates the completion of pre-fault I/O requests.
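The checkpoint, log, and replay cycle described in the abstract can be illustrated with a small software sketch. This is only a toy model under stated assumptions, not the patented hardware design: the names `RecoveryModule`, `ToyMachine`, `step`, `interrupt`, `recover`, and `demo` are hypothetical, each "instruction" simply reads one value and adds it to an accumulator, and the recording of pre-fault I/O request completions is left out.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class RecoveryModule:
    """Hypothetical software stand-in for the hardware recovery module."""
    checkpoint: dict = field(default_factory=dict)
    input_log: list = field(default_factory=list)      # data transferred since checkpoint
    interrupt_log: list = field(default_factory=list)  # (instruction count, interrupt id)

    def take_checkpoint(self, state):
        # Save a private copy of the system state and start fresh logs.
        self.checkpoint = copy.deepcopy(state)
        self.input_log.clear()
        self.interrupt_log.clear()


class ToyMachine:
    """Toy CPU: every instruction reads one value and adds it to 'acc'."""

    def __init__(self, recovery):
        self.state = {"acc": 0, "irq_seen": []}
        self.count = 0       # recovery counter: instructions since checkpoint
        self.outputs = []    # externally visible outputs
        self.recovery = recovery

    def step(self, read_input, replaying=False):
        value = read_input()
        if not replaying:
            self.recovery.input_log.append(value)   # record transferred data
        self.state["acc"] += value
        if not replaying:
            self.outputs.append(self.state["acc"])  # suppressed during replay
        self.count += 1

    def interrupt(self, irq_id, replaying=False):
        if not replaying:
            # Record when (in instructions) the interrupt arrived, and which one.
            self.recovery.interrupt_log.append((self.count, irq_id))
        self.state["irq_seen"].append(irq_id)

    def recover(self, fault_count):
        """Restore the checkpoint, then synthetically re-execute to the fault point."""
        self.state = copy.deepcopy(self.recovery.checkpoint)
        self.count = 0
        inputs = iter(self.recovery.input_log)
        pending = list(self.recovery.interrupt_log)
        while True:
            # Re-deliver logged interrupts at their original instruction counts.
            while pending and pending[0][0] == self.count:
                self.interrupt(pending.pop(0)[1], replaying=True)
            if self.count >= fault_count:
                break  # fault point reached: synthetic execution ends
            self.step(lambda: next(inputs), replaying=True)  # simulated input


def demo():
    rm = RecoveryModule()
    m = ToyMachine(rm)
    rm.take_checkpoint(m.state)
    feed = iter([5, 7, 11])
    m.step(lambda: next(feed))
    m.interrupt("timer")
    m.step(lambda: next(feed))
    good_state = copy.deepcopy(m.state)
    fault_count = m.count
    m.state = {"acc": -999, "irq_seen": []}  # simulate corruption by a fault
    m.recover(fault_count)
    recovered_state = copy.deepcopy(m.state)
    m.step(lambda: next(feed))               # true execution resumes
    return good_state, recovered_state, m.outputs
```

In this sketch, replay reproduces the pre-fault state because inputs come from the log in their original order and interrupts are re-delivered at the same instruction counts, while the `replaying` flag keeps outputs from being emitted a second time.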
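Publication number: US4740969(A) Publication date: 1988.04.26
Application number: US19860879244 Filing date: 1986.06.27
Applicant: HEWLETT-PACKARD COMPANY Inventor: FREMONT, MICHAEL J.
Classification: G06F11/00;G06F11/14;(IPC1-7):G06F11/00 Main classification: G06F11/00
Agency: Agent:
Main claim:
Address: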