Independent claim |
1. A system for failure event detection and grouping using adaptive polling intervals and sliding window buffering, said system comprising:
a memory area associated with a computing device, said memory area storing a plurality of virtual machines (VMs) and datastores accessible thereto, a value for a short timer, and a value for a long timer; and
a processor programmed to:
upon detection of a failure event affecting at least one of the plurality of VMs and/or datastores accessible thereto, poll for additional failure events during each of a series of polling intervals until the short timer or the long timer expires, the polling during each of the series of polling intervals comprising:
upon detection of at least one of the additional failure events, collecting data relating to the detected additional failure event, resetting the short timer, and reducing a duration of a next polling interval; and
upon no detection of at least one of the additional failure events, increasing a duration of a next polling interval;
group the detected failure event with the detected additional failure events; and
perform recovery operations in parallel for each of the grouped failure events. |
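
The polling loop recited in the claim can be illustrated with a minimal Python sketch. This is not the claimed implementation. All names (`PollConfig`, `collect_failure_group`, `recover_in_parallel`, `poll`, `recover`) are hypothetical, and several details are assumptions the claim does not specify: the interval is halved when an event is found and doubled when none is found, it is clamped between a minimum and a maximum, and the long timer acts as a fixed cap that is never reset.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class PollConfig:
    # All values are illustrative assumptions; the claim only requires that
    # a short timer value and a long timer value are stored.
    short_timer: float = 10.0       # quiet period; reset on every new event
    long_timer: float = 60.0        # overall cap; never reset (assumption)
    initial_interval: float = 2.0
    min_interval: float = 0.5
    max_interval: float = 8.0
    factor: float = 2.0             # halve on a hit, double on a miss (assumption)


def collect_failure_group(first_event, poll, config,
                          clock=time.monotonic, sleep=time.sleep):
    """Poll for additional failures until the short or long timer expires.

    `poll` is a hypothetical callable returning a list of newly detected
    failure events (possibly empty). Returns the grouped events.
    """
    group = [first_event]
    start = clock()
    short_deadline = start + config.short_timer
    long_deadline = start + config.long_timer
    interval = config.initial_interval

    while True:
        now = clock()
        if now >= short_deadline or now >= long_deadline:
            break
        # Never sleep past whichever timer expires first.
        sleep(min(interval, short_deadline - now, long_deadline - now))
        events = poll()
        if events:
            group.extend(events)                            # collect event data
            short_deadline = clock() + config.short_timer   # reset short timer
            interval = max(config.min_interval, interval / config.factor)
        else:
            interval = min(config.max_interval, interval * config.factor)
    return group


def recover_in_parallel(group, recover):
    """Run the hypothetical `recover` callable on every grouped event concurrently."""
    with ThreadPoolExecutor(max_workers=max(1, len(group))) as pool:
        return list(pool.map(recover, group))
```

In this sketch, `clock` and `sleep` are injectable so the loop can be exercised with a simulated clock instead of waiting in real time. A thread pool is only one way to run recovery operations in parallel; the claim does not name a mechanism.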