Abstract |
Computing environments, each executing at least one software program, are monitored for failures that occur during execution of the software program. Information associated with each failure, such as an identification of the software program and a failure type describing the failure, is recorded. The failure information is quantified to report the number of times the software program has failed or the number of times a particular failure has occurred. The quantified data may help in prioritizing which programs or which failures merit investigation and resolution. Reports may be received from failing computing systems that have stopped at a state following the occurrence of a failure. In response to such a report, hold information is checked to determine whether to instruct the failing computing system to hold a state existing upon the occurrence of the failure.
|
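The flow summarized in the abstract (record each failure, count failures per program and per failure type, then consult hold information when a failing system reports in) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the source: the `FailureTracker` class, its method names, the representation of hold information as a set of (program, failure type) pairs, and the `"hold"` / `"release"` replies are all assumptions made for illustration only.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FailureTracker:
    """Hypothetical aggregator for failure reports (illustrative only)."""

    # Number of recorded failures per software program.
    by_program: Counter = field(default_factory=Counter)
    # Number of recorded occurrences per failure type.
    by_failure_type: Counter = field(default_factory=Counter)
    # Assumed form of "hold information": (program, failure_type) pairs
    # for which a failing system should keep its failure state.
    hold_info: set = field(default_factory=set)

    def record(self, program: str, failure_type: str) -> None:
        """Record one failure and update both counts."""
        self.by_program[program] += 1
        self.by_failure_type[failure_type] += 1

    def should_hold(self, program: str, failure_type: str) -> bool:
        """Check the hold information for this program and failure type."""
        return (program, failure_type) in self.hold_info

    def handle_report(self, program: str, failure_type: str) -> str:
        """Record a report from a failing system and choose a reply."""
        self.record(program, failure_type)
        return "hold" if self.should_hold(program, failure_type) else "release"
```

Under these assumptions, calling `by_program.most_common()` or `by_failure_type.most_common()` on a tracker would list programs or failure types by descending count, which is one way the quantified data could support prioritization.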