Abstract |
<p>A method and apparatus for providing a fault-tolerant backup system such that, if a primary processing system (21) fails, a replicated system (22) can take over without interruption. The invention provides a software solution for providing a backup system. Two servers are provided, a primary server (21) and a secondary server (22). The two servers are connected via a communications channel (15). Each server has an operating system associated with it. The present invention divides this operating system into two 'engines' (10, 16). An I/O engine (12, 18) is responsible for receiving and handling all data and asynchronous events on the system. The I/O engine controls and interfaces with physical devices (44) and device drivers. The operating system (OS) engine operates on data received from the I/O engine. All events or data that can change the state of the operating system are channeled through the I/O engine and converted to a message format. The I/O engines on the two servers coordinate with each other and provide the same sequence of messages to the OS engines. The messages are placed in a message queue accessed by the OS engine. Therefore, regardless of the timing of the events (i.e., asynchronous events), the OS engine receives all events sequentially, as a continuous sequential stream of input data. As a result, the OS engine is a finite state automaton with a one-dimensional input 'view' of the rest of the system, and the states of the OS engines on the primary and secondary servers will converge.</p> |
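<p>The arrangement described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. All names here (<code>Message</code>, <code>OSEngine</code>, <code>IOEngine</code>, <code>demo</code>) and the toy "add"/"mul" events are hypothetical, and a direct in-process method call stands in for the communications channel (15). The sketch only shows why two deterministic OS engines fed the same ordered message stream end in the same state.</p>

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """An event converted to message form (hypothetical layout)."""
    seq: int      # position in the single ordered stream
    kind: str     # toy event type: "add" or "mul"
    payload: int


class OSEngine:
    """Deterministic automaton: its state depends only on the message sequence."""

    def __init__(self):
        self.queue = deque()   # message queue filled by the I/O engine
        self.state = 0

    def enqueue(self, msg):
        self.queue.append(msg)

    def run(self):
        # Consume messages strictly in arrival order.
        while self.queue:
            msg = self.queue.popleft()
            if msg.kind == "add":
                self.state += msg.payload
            elif msg.kind == "mul":
                self.state *= msg.payload


class IOEngine:
    """Turns asynchronous events into messages and mirrors them to its peer."""

    def __init__(self, os_engine):
        self.os_engine = os_engine
        self.peer = None       # the other server's I/O engine, if any
        self.next_seq = 0

    def on_event(self, kind, payload):
        # Assign a sequence number so both servers see the same order.
        msg = Message(self.next_seq, kind, payload)
        self.next_seq += 1
        if self.peer is not None:
            # Stand-in for sending over the communications channel.
            self.peer.on_peer_message(msg)
        self.os_engine.enqueue(msg)

    def on_peer_message(self, msg):
        self.os_engine.enqueue(msg)


def demo():
    primary_os, secondary_os = OSEngine(), OSEngine()
    primary_io, secondary_io = IOEngine(primary_os), IOEngine(secondary_os)
    primary_io.peer = secondary_io

    # Events arrive at the primary at arbitrary times; only their order matters.
    for kind, payload in [("add", 3), ("mul", 4), ("add", 5)]:
        primary_io.on_event(kind, payload)

    primary_os.run()
    secondary_os.run()
    return primary_os.state, secondary_os.state
```

<p>Calling <code>demo()</code> runs both OS engines over the same three messages and returns their final states, which are equal. The sketch omits failure detection, takeover, and acknowledgement of forwarded messages, none of which the abstract specifies in detail.</p>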