Abstract |
System-directed checkpointing is accomplished by mapping all memory pages, both read-only pages and read/write pages, as read-only immediately after each checkpoint. Any subsequent attempt to write to a page therefore generates a page-fault interrupt. If the page is genuinely a read-only page, the normal page-fault interrupt protocol is followed. If the page is a read/write page that has been temporarily labeled read-only, the page is copied to a buffer and the memory map is changed to indicate that the page is now a read/write page. The pages in the buffer can then be used to restore the system after a fault. In accordance with another embodiment of the invention, after the page-fault interrupt occurs, the identity of the page is recorded in a backup computer, but the page itself is not copied. In addition, the locations of all pages modified through I/O events are also recorded. At the time of a checkpoint, the checkpoint software copies the contents of all modified pages to a memory in the backup computer. The backup computer can then be used to restart the system after a fault. This latter technique can also be used in a clustered environment, with one computer effectively serving as a backup for every other computer in the cluster.
|
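The two embodiments described in the abstract can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions: real page protection and page-fault interrupts are handled by the memory-management hardware and the operating system, whereas here a "fault" is simply a method call. The class names (`Memory`, `LocalBufferCheckpoint`, `BackupCheckpoint`), the `io_write` helper, and the 16-byte page size are hypothetical and do not come from the source.

```python
PAGE_SIZE = 16  # illustrative page size (assumption, far smaller than a real page)


class Memory:
    """Toy page table: each page has a true permission and a current mapping."""

    def __init__(self, writable_flags):
        self.data = [bytearray(PAGE_SIZE) for _ in writable_flags]
        self.writable = list(writable_flags)        # true permission of each page
        self.mapped_ro = [True] * len(self.data)    # mapping in force since checkpoint

    def write(self, page_no, offset, value):
        if self.mapped_ro[page_no]:                 # simulated page-fault interrupt
            self.on_fault(page_no)
        self.data[page_no][offset] = value

    def on_fault(self, page_no):
        if not self.writable[page_no]:
            # Genuinely read-only page: normal page-fault protocol applies.
            raise PermissionError(f"page {page_no} is read-only")
        self.record(page_no)                        # embodiment-specific bookkeeping
        self.mapped_ro[page_no] = False             # page is read/write again

    def remap_all_read_only(self):
        self.mapped_ro = [True] * len(self.data)


class LocalBufferCheckpoint(Memory):
    """First embodiment: copy the page to a buffer on the first write."""

    def __init__(self, writable_flags):
        super().__init__(writable_flags)
        self.buffer = {}

    def record(self, page_no):
        self.buffer[page_no] = bytes(self.data[page_no])  # pre-write contents

    def checkpoint(self):
        self.buffer.clear()
        self.remap_all_read_only()

    def restore(self):
        for page_no, saved in self.buffer.items():
            self.data[page_no][:] = saved
        self.checkpoint()


class BackupCheckpoint(Memory):
    """Second embodiment: record page identity only, copy at checkpoint time."""

    def __init__(self, writable_flags):
        super().__init__(writable_flags)
        self.dirty = set()                              # identities of modified pages
        self.backup = [bytes(p) for p in self.data]     # memory of the backup computer

    def record(self, page_no):
        self.dirty.add(page_no)

    def io_write(self, page_no, offset, value):
        # I/O does not take the page-fault path, so its location is recorded directly.
        self.dirty.add(page_no)
        self.data[page_no][offset] = value

    def checkpoint(self):
        for page_no in self.dirty:
            self.backup[page_no] = bytes(self.data[page_no])
        self.dirty.clear()
        self.remap_all_read_only()

    def restore(self):
        for page_no, saved in enumerate(self.backup):
            self.data[page_no][:] = saved
        self.dirty.clear()
        self.remap_all_read_only()
```

In this sketch the first variant pays the copying cost on the first write to each page, while the second defers all copying to checkpoint time and keeps only a set of page identities in between.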