Title: Progressive retry method and apparatus having reusable software modules for software failure recovery in multi-process message-passing applications
Abstract: A progressive retry recovery system based on checkpointing, message logging, rollback, message replaying and message reordering is disclosed. The system minimizes both the number of processes involved in a recovery and the total rollback distance. It includes a fault-tolerant software library whose functions application processes can invoke to implement fault tolerance. One group of functions lets an application process generate a heartbeat message at specified intervals, indicating that the process is still active. A second group specifies critical memory, executes checkpoints that store backup copies of critical data, and restores critical data during a recovery. A third group processes the messages sent or received by an application process and maintains logs of those messages. The progressive retry recovery method consists of a sequence of retry steps; whenever a retry step fails, the next step widens the scope of the rollback.
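The escalation described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `RetryStep` type, the `processes_involved` field, the `progressive_retry` function and the step labels are all hypothetical names chosen for illustration, and each step's recovery action is reduced to a callable that reports success or failure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RetryStep:
    """One recovery attempt (hypothetical structure, not from the patent)."""
    name: str                    # illustrative label for the step
    processes_involved: int      # assumed measure of rollback scope
    attempt: Callable[[], bool]  # returns True if recovery succeeded


def progressive_retry(steps: List[RetryStep], log: List[str]) -> Optional[str]:
    """Try steps in order of increasing scope; stop at the first success.

    Each attempted step name is appended to `log` so the escalation
    order can be inspected afterwards. Returns the name of the step
    that succeeded, or None if every step failed.
    """
    # sorted() is stable, so steps with equal scope keep their given order.
    for step in sorted(steps, key=lambda s: s.processes_involved):
        log.append(step.name)
        if step.attempt():
            return step.name
    return None


# Example: the two narrow steps fail, so the widest rollback is attempted last.
attempts: List[str] = []
steps = [
    RetryStep("roll back all processes", 3, lambda: True),
    RetryStep("replay logged messages", 1, lambda: False),
    RetryStep("reorder logged messages", 1, lambda: False),
]
recovered_by = progressive_retry(steps, attempts)
```

Ordering the steps by scope means a wider rollback is attempted only after every narrower one has failed, which is how the sketch reflects the stated goal of minimizing the number of involved processes.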
Publication number: US5440726(A) Publication date: 1995.08.08
Application number: US19940263916 Filing date: 1994.06.22
Applicant: AT&T CORP. Inventors: FUCHS, WESLEY K.;HUANG, YENNUN;KINTALA, CHANDRA M.;WANG, YI-MIN
Classification: G06F11/14;G06F15/00;(IPC1-7):G06F11/18 Main classification: G06F11/14
Agency: Agent:
Main claim:
Address: