Title: Maintaining High Availability During Network Partitions for Virtual Machines Stored on Distributed Object-Based Storage
Abstract: Techniques are disclosed for maintaining high availability (HA) for virtual machines (VMs) running on host systems of a host cluster, where each host system executes a HA module in a plurality of HA modules and a storage module in a plurality of storage modules, where the host cluster aggregates, via the plurality of storage modules, locally-attached storage resources of the host systems to provide an object store, where persistent data for the VMs is stored as per-VM storage objects across the locally-attached storage resources comprising the object store, and where a failure causes the plurality of storage modules to observe a network partition in the host cluster that the plurality of HA modules do not. In one embodiment, a host system in the host cluster executing a first HA module invokes an API exposed by the plurality of storage modules for persisting metadata for a VM to the object store. If the API is not processed successfully, the host system: (1) identifies a subset of second HA modules in the plurality of HA modules; (2) issues an accessibility query for the VM to the subset of second HA modules in parallel, the accessibility query being configured to determine whether the VM is accessible to the respective host systems of the subset of second HA modules; and (3) if at least one second HA module in the subset indicates that the VM is accessible to its respective host system, transmits a command to the at least one second HA module to invoke the API on its respective host system.
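Publication number: US2015378761(A1) Publication date: 2015.12.31
Application number: US201414317712 Filing date: 2014.06.27
Applicant: VMware, Inc. Inventors: Sevigny Marc; Farkas Keith; Karamanolis Christos
Classification: G06F9/455; G06F9/54 Main classification: G06F9/455
Agency: Agent: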
Main claim: 1. A method for maintaining high availability (HA) for virtual machines (VMs) running on host systems of a host cluster, wherein each host system executes a HA module in a plurality of HA modules and a storage module in a plurality of storage modules, wherein the host cluster aggregates, via the plurality of storage modules, locally-attached storage resources of the host systems to provide an object store, wherein persistent data for the VMs is stored as per-VM storage objects across the locally-attached storage resources comprising the object store, and wherein a failure causes the plurality of storage modules to observe a network partition in the host cluster that the plurality of HA modules do not, the method comprising: invoking, by a host system in the host cluster executing a first HA module, an application programming interface (API) exposed by the plurality of storage modules for persisting metadata for a VM to the object store; and if the API is not processed successfully: identifying, by the host system, a subset of second HA modules in the plurality of HA modules; issuing, by the host system, an accessibility query for the VM to the subset of second HA modules in parallel, the accessibility query being configured to determine whether the VM is accessible to the respective host systems of the subset of second HA modules; and if at least one second HA module in the subset indicates that the VM is accessible to its respective host system, transmitting, by the host system, a command to the at least one second HA module to invoke the API on its respective host system.
Address: Palo Alto CA US
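
The control flow recited in the main claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `HAModule`, `StorageApiError`, `persist_metadata`, `is_accessible`, and `persist_with_fallback` are all hypothetical, the storage API is simulated in memory, a direct method call stands in for the command transmitted to a second HA module, and the subset is assumed to be every peer because the record does not state a selection policy.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


class StorageApiError(Exception):
    """Raised when the simulated storage API cannot persist metadata."""


@dataclass
class HAModule:
    """Hypothetical stand-in for the HA module running on one host system."""
    host: str
    # VMs whose storage objects this host can reach in its storage partition.
    accessible_vms: Set[str] = field(default_factory=set)
    persisted: Dict[str, dict] = field(default_factory=dict)

    def persist_metadata(self, vm: str, metadata: dict) -> None:
        # Simulated storage-module API: fails if the VM's objects are unreachable.
        if vm not in self.accessible_vms:
            raise StorageApiError(f"{self.host} cannot reach objects of {vm}")
        self.persisted[vm] = metadata

    def is_accessible(self, vm: str) -> bool:
        # Answer to the accessibility query issued by the first HA module.
        return vm in self.accessible_vms


def persist_with_fallback(first: HAModule, peers: List[HAModule],
                          vm: str, metadata: dict) -> Optional[str]:
    """Return the host that persisted the metadata, or None if none could."""
    try:
        first.persist_metadata(vm, metadata)
        return first.host
    except StorageApiError:
        pass  # API not processed successfully: fall through to the peers.

    subset = list(peers)  # assumption: query every peer
    if not subset:
        return None

    # Issue the accessibility query to the whole subset in parallel.
    with ThreadPoolExecutor(max_workers=len(subset)) as pool:
        answers = list(pool.map(lambda m: m.is_accessible(vm), subset))

    # Ask the first peer that reported access to invoke the API on its host.
    for module, accessible in zip(subset, answers):
        if accessible:
            try:
                module.persist_metadata(vm, metadata)
                return module.host
            except StorageApiError:
                continue  # access may have been lost since the query
    return None
```

In this sketch, a first HA module whose host is cut off from a VM's storage objects still gets the metadata persisted, provided at least one peer in the queried subset can reach them.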