Title: System and method for event-driven live migration of multi-process applications
Abstract: A system, method, and computer readable medium for asynchronous live migration of applications between two or more servers. The computer readable medium includes computer-executable instructions for execution by a processing system. Primary applications run on primary hosts, and one or more replicated instances of each primary application run on one or more backup hosts. Asynchronous live migration is provided through a combination of process replication, logging, barrier synchronization, checkpointing, reliable messaging and message playback. The live migration is transparent to the application and requires no modification to the application, operating system, networking stack or libraries.
Publication number: US9043640(B1) Publication date: 2015.05.26
Application number: US201012957637 Filing date: 2010.12.01
Applicant: Open Invention Network, LLP Inventor: Havemose Allan
Classification: G06F11/00;G06F11/20 Main classification: G06F11/00
Agency: Haynes and Boone, LLP Agent: Haynes and Boone, LLP
Main claim: 1. A system for providing live migration of a primary application to one or more backup applications, the system comprising: one or more computer system memory locations configured to store said primary application; one or more Central Processing Units (CPUs) operatively connected to said computer system memory and configured to execute said primary application on a primary host with a host operating system; one or more interceptors configured to intercept calls to the host operating system and shared libraries, and configured to generate replication messages based on said intercepted calls, wherein said interceptors intercept at least one or more of process operations, thread operations, file operations, lock operations, Input operations/Output operations, and resource operations; a messaging layer for said primary application configured to transmit said replication messages to the one or more backups; a logging facility for said messaging layer configured to log all replication messages and checkpoints; a checkpointing service for said primary application configured to checkpoint said primary application; and one or more backup hosts each with a host operating system and each comprising: computer system memory comprising one or more computer system memory locations configured to store one or more backup applications, and one or more Central Processing Units (CPUs) operatively connected to said computer system memory and configured to execute said one or more backup applications; one or more interceptors configured to intercept calls to said one or more backup host operating systems and shared libraries; a messaging layer for each one or more backup applications configured to provide ordered receipt of said replication messages; and one or more checkpointing services for said one or more backup applications; wherein live migration is performed in response to an event; wherein a replication message optionally contains a DATA block in which the interceptor for the operation for said replication message stores information required during message replay comprised of at least one of a return value, results, parameters, or state for said operation, or no DATA block if no such information is required; wherein restore on a backup is initiated by loading the most recent checkpoint from the logging facility; wherein replication messages subsequent to said most recent checkpoint are read from the logging facility and replayed to the backup; and wherein said replayed replication messages subsequent to said most recent checkpoint uses information stored in a DATA block instead of executing the operations associated with the replication message if said DATA block is available including using at least one of a return value, results, parameters, or a state generated by the primary application and transmitted using a DATA block instead of executing the operation on the backup, or executes the operation if said DATA block is not available.
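The restore and replay behavior recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the names `Checkpoint`, `ReplicationMessage`, `LoggingFacility`, `intercept`, `restore`, and the `OPERATIONS` table are all hypothetical, application state is assumed to be a plain dictionary, and the DATA block is assumed to carry only a return value. The point it shows is that replay starts from the most recent logged checkpoint and, for each later message, uses the DATA block when one is present and re-executes the operation otherwise.

```python
import copy
import time
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple, Union


@dataclass
class Checkpoint:
    """Snapshot of application state (assumption: a plain dict)."""
    state: Dict[str, Any]


@dataclass
class ReplicationMessage:
    """One intercepted operation. `data` stands in for the optional DATA block."""
    operation: str
    target: str                            # state key that receives the result
    args: Tuple[Any, ...] = ()
    data: Optional[Dict[str, Any]] = None  # None means "no DATA block"


# Hypothetical operation table: "add" is deterministic, "now" is not.
OPERATIONS = {
    "add": lambda state, n: state["total"] + n,
    "now": lambda state: time.time(),
}


class LoggingFacility:
    """Ordered log of replication messages and checkpoints."""

    def __init__(self) -> None:
        self.entries: List[Union[Checkpoint, ReplicationMessage]] = []

    def append(self, entry: Union[Checkpoint, ReplicationMessage]) -> None:
        self.entries.append(entry)

    def latest_checkpoint_and_tail(self):
        # Scan backwards for the most recent checkpoint; everything after
        # it is the list of messages that still has to be replayed.
        for i in range(len(self.entries) - 1, -1, -1):
            if isinstance(self.entries[i], Checkpoint):
                return self.entries[i], self.entries[i + 1:]
        raise LookupError("no checkpoint in log")


def intercept(log, state, operation, target, args=(), record_result=False):
    """Primary side: run the operation, then log a replication message."""
    result = OPERATIONS[operation](state, *args)
    state[target] = result
    data = {"return_value": result} if record_result else None
    log.append(ReplicationMessage(operation, target, tuple(args), data))
    return result


def restore(log):
    """Backup side: load the latest checkpoint, then replay later messages."""
    checkpoint, tail = log.latest_checkpoint_and_tail()
    state = copy.deepcopy(checkpoint.state)
    for msg in tail:
        if msg.data is not None:
            # DATA block available: reuse the primary's result, do not execute.
            result = msg.data["return_value"]
        else:
            # No DATA block: execute the operation on the backup.
            result = OPERATIONS[msg.operation](state, *msg.args)
        state[msg.target] = result
    return state


log = LoggingFacility()
primary = {"total": 0}
intercept(log, primary, "add", "total", (5,))
log.append(Checkpoint(copy.deepcopy(primary)))
intercept(log, primary, "add", "total", (2,))
intercept(log, primary, "now", "stamp", record_result=True)
backup = restore(log)
```

In this sketch the clock read is the case that needs a DATA block: re-executing it on the backup would produce a different timestamp, so the primary's value is carried in the message and reused, while the deterministic addition is simply executed again.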
Address: Durham NC US