Principal claim |
1. A method for processing a fault for use in a virtual computer system having a plurality of Logical PARtitions (LPARs) generated on a first physical computer comprising a first hypervisor and a second physical computer comprising a second hypervisor, wherein a first and second LPAR constitute a first cluster, wherein a third and fourth LPAR constitute a second cluster, wherein the second LPAR comprises a first cluster control unit, and wherein the third LPAR comprises a second cluster control unit, the method comprising:
determining, by the first hypervisor, in response to an occurrence of a fault in the first physical computer, whether one of the plurality of LPARs is available to continue execution on the first physical computer;
stopping, by the first hypervisor, on a condition that one of the plurality of LPARs is not available to continue execution, a first LPAR that is not available to continue execution and causing the first cluster control unit in the second LPAR that is generated on the second physical computer to conduct a first failover of the first LPAR to the second LPAR; and
conducting, by the second hypervisor, on a condition that one of the plurality of LPARs is available to continue execution by causing the first cluster control unit to conduct a second failover to the fourth LPAR, wherein the fourth LPAR is available to continue execution to conduct the second failover of the third LPAR to the fourth LPAR,
wherein on a condition that the first and second hypervisors have fault notice information, the fault notice information is used to determine whether a fault notice request is present for every LPAR and is used to determine whether the LPAR can be stopped after failover based on a fault that does not affect execution of the LPARs,
the first hypervisor refers to fault notice information the first hypervisor has, and if there is a request for a fault notice from the third LPAR that is available to continue execution, the first hypervisor transmits the fault notice to the third LPAR, and
on a condition that the second cluster control unit has received the fault notice, has failover request information to manage a situation of the second failover, and sets presence of a request for the second failover in the failover request information, the second cluster control unit refers to the failover request information, and on a condition that there is the failover request, the second cluster control unit conducts the second failover, and
upon completion of the second failover, the second cluster control unit sets a stop possibility for the third LPAR in the fault notice information in the first hypervisor to “possible” after the second failover. |
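
The control flow recited in the claim can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the claimed implementation: the class names (`Lpar`, `FaultNoticeEntry`, `ClusterControlUnit`, `Hypervisor`), the `affected` set used to model which LPARs cannot continue, and the string "impossible" as the initial stop possibility are all hypothetical. Only the first hypervisor's side is modeled; the second hypervisor and any real inter-machine messaging are omitted.

```python
from dataclasses import dataclass


@dataclass
class Lpar:
    name: str
    running: bool = True
    active: bool = False  # True while this LPAR hosts its cluster's workload


@dataclass
class FaultNoticeEntry:
    """One row of the hypervisor's fault notice information (hypothetical layout)."""
    notice_requested: bool = False
    stop_possibility: str = "impossible"  # assumed initial value


class ClusterControlUnit:
    """Manages failover from a primary LPAR to its standby LPAR."""

    def __init__(self, primary, standby):
        self.primary, self.standby = primary, standby
        self.failover_requested = False  # failover request information

    def conduct_failover(self):
        self.primary.active = False
        self.standby.active = True

    def on_fault_notice(self, hypervisor):
        # Record the request, then act on it by consulting the recorded state.
        self.failover_requested = True
        if self.failover_requested:
            self.conduct_failover()
            self.failover_requested = False
            # After failover the primary may be stopped safely.
            entry = hypervisor.fault_notice[self.primary.name]
            entry.stop_possibility = "possible"


class Hypervisor:
    def __init__(self):
        self.fault_notice = {}   # LPAR name -> FaultNoticeEntry
        self.cluster_units = {}  # LPAR name -> ClusterControlUnit

    def register(self, lpar, unit, notice_requested=False):
        self.fault_notice[lpar.name] = FaultNoticeEntry(notice_requested)
        self.cluster_units[lpar.name] = unit

    def handle_fault(self, lpars, affected):
        """`affected` holds the names of LPARs that cannot continue execution."""
        for lpar in lpars:
            unit = self.cluster_units[lpar.name]
            if lpar.name in affected:
                # First failover: stop the LPAR and move its work to the standby.
                lpar.running = False
                unit.conduct_failover()
            elif self.fault_notice[lpar.name].notice_requested:
                # Second failover: the LPAR keeps running but is notified.
                unit.on_fault_notice(self)


# Usage: LPAR1 and LPAR3 run on the first physical computer.
lpar1, lpar2, lpar3, lpar4 = (Lpar(f"LPAR{i}") for i in range(1, 5))
lpar1.active = lpar3.active = True
hv1 = Hypervisor()
hv1.register(lpar1, ClusterControlUnit(lpar1, lpar2))
hv1.register(lpar3, ClusterControlUnit(lpar3, lpar4), notice_requested=True)
hv1.handle_fault([lpar1, lpar3], affected={"LPAR1"})
```

In this sketch the fault stops LPAR1 and moves its workload to LPAR2, while LPAR3 keeps running, hands its workload to LPAR4 on receiving the fault notice, and is then marked as stoppable in the first hypervisor's fault notice information.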