Abstract |
A method and apparatus providing intelligent fault recovery are presented. The apparatus includes line card equipment that separates control plane components from data plane components, enabling a control plane reset while the data plane remains operational, at least to the extent that content conveyed for currently provisioned services continues to be processed through it. Detected faults are categorized according to a group of severity levels, and recovery behavior is specified for each fault severity level. As a guiding principle of the engineered failure mitigation response, the control plane of a fault-affected line card is reset in an attempt to mitigate an experienced fault condition; the entire line card is reset only as a last resort and only to restore service. Potentially service-affecting faults, partially service-affecting faults, and non-service-affecting faults are tolerated to the extent possible in the absence of further information regarding the service impact the reset action would have. Meta-information, typically available from a remote communication network location, is employed in providing the engineered failure mitigation response. Advantages are derived from an engineered response to detected faults, which provides increased line card component availability and therefore increased overall communications network infrastructure availability.
|
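The severity-based recovery policy summarized in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the names `Severity`, `Action`, and `choose_action`, the two flags, and the exact branching are hypothetical, chosen only to show the ordering the abstract describes (tolerate where possible, reset the control plane first, reset the whole line card only as a last resort to restore service).

```python
from enum import Enum, auto
from typing import Optional


class Severity(Enum):
    # Severity levels named in the abstract; the enum itself is illustrative.
    SERVICE_AFFECTING = auto()
    POTENTIALLY_SERVICE_AFFECTING = auto()
    PARTIALLY_SERVICE_AFFECTING = auto()
    NON_SERVICE_AFFECTING = auto()


class Action(Enum):
    TOLERATE = auto()
    RESET_CONTROL_PLANE = auto()
    RESET_LINE_CARD = auto()


def choose_action(
    severity: Severity,
    control_plane_reset_tried: bool = False,
    reset_is_hitless: Optional[bool] = None,
) -> Action:
    """Pick a recovery action for a detected fault.

    reset_is_hitless stands in for the meta-information about the service
    impact of a reset (assumption): None means no information is available,
    True means a control plane reset would not disturb provisioned services.
    """
    if severity is Severity.SERVICE_AFFECTING:
        # Service is already lost: try the control plane first, and reset
        # the entire line card only as a last resort to restore service.
        if not control_plane_reset_tried:
            return Action.RESET_CONTROL_PLANE
        return Action.RESET_LINE_CARD

    # Remaining severities: tolerate the fault unless meta-information says
    # a control plane reset is safe and has not been attempted yet
    # (this branch is an assumption, not stated in the abstract).
    if reset_is_hitless and not control_plane_reset_tried:
        return Action.RESET_CONTROL_PLANE
    return Action.TOLERATE
```

In this sketch a full line card reset is reachable only from the service-affecting branch, which mirrors the "only to restore service" condition; a real system would likely consult richer meta-information than a single flag.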