Abstract |
A method and apparatus are provided for performing cross-host root cause diagnosis within a complex multi-host environment. In such an environment, a system failure on one host can cause problems on another host in the same environment. A probabilistic model represents the failures that can occur within each host. The cause-and-effect relationships among these failures, together with measurement values, are used to generate, for each host, a probability that each potential failure has occurred. When a problem is observed on one host and no corresponding root cause is detected within that host, a cross-host failure diagnosis is performed: the probabilistic models for the other hosts in the environment are used to determine the most likely cause of the failure. |
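
The diagnosis flow summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the probabilistic model, so everything concrete here is an assumption made for illustration: the naive-Bayes update over binary measurements, the `FALSE_ALARM` rate, the `threshold` parameter, the `remote_effects` field, and all host, failure, and measurement names are hypothetical and are not taken from the described method.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Failure:
    name: str
    prior: float                  # assumed P(failure) before any evidence
    likelihood: dict[str, float]  # assumed P(measurement abnormal | failure present)
    # Hypothetical field: symptoms this failure can cause on other hosts.
    remote_effects: set[str] = field(default_factory=set)


FALSE_ALARM = 0.05  # assumed P(measurement abnormal | failure absent)


def posterior(failure: Failure, observed: dict[str, bool]) -> float:
    """Bayes update for one failure, treating measurements as independent."""
    p_present, p_absent = failure.prior, 1.0 - failure.prior
    for name, abnormal in observed.items():
        if name not in failure.likelihood:
            continue  # this measurement says nothing about this failure
        hit = failure.likelihood[name]
        p_present *= hit if abnormal else 1.0 - hit
        p_absent *= FALSE_ALARM if abnormal else 1.0 - FALSE_ALARM
    return p_present / (p_present + p_absent)


def diagnose(symptom, host, models, measurements, threshold=0.5):
    """Return (host, failure name, probability) for the most likely cause, or None."""
    # Step 1: look for a root cause on the host where the problem was observed.
    # Simplification: every local failure is treated as a candidate cause.
    local = [(host, f.name, posterior(f, measurements[host])) for f in models[host]]
    local = [c for c in local if c[2] >= threshold]
    if local:
        return max(local, key=lambda c: c[2])
    # Step 2: no local root cause detected, so consult the other hosts' models.
    remote = [
        (h, f.name, posterior(f, measurements[h]))
        for h, failures in models.items() if h != host
        for f in failures if symptom in f.remote_effects
    ]
    return max(remote, key=lambda c: c[2], default=None)


# Hypothetical two-host environment: a slow web host whose own metrics look healthy.
models = {
    "web": [Failure("web_cpu_saturation", 0.05, {"cpu_high": 0.9})],
    "db": [Failure("db_disk_full", 0.05, {"disk_usage_high": 0.95}, {"slow_response"})],
}
measurements = {
    "web": {"cpu_high": False},
    "db": {"disk_usage_high": True},
}
result = diagnose("slow_response", "web", models, measurements)
```

In this toy setup the web host's only modeled failure falls well below the threshold, so the search moves to the database host and returns its disk failure as the most likely cause. A fuller model would encode the cause-and-effect relationships among failures directly (for example as a Bayesian network) rather than the flat independence assumption used here.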