Title of invention: A system and method for comprehensive availability management in a high-availability computer system
Abstract: A system and method for availability management coordinates the operational states of components to implement a desired redundancy model within a high-availability computing system. Within the availability management system, an availability manager monitors reports on the status of components and nodes in the system. The availability manager uses these reports to direct components to change state when necessary to maintain the desired redundancy model. The availability management system includes a health monitor, which performs status audits on individual components and reports component status changes. The system also includes a watch-dog timer, which monitors the health monitor and reboots the entire node containing the health monitor if the health monitor becomes non-responsive. Each node within the system also includes a cluster membership monitor, which detects nodes that become non-responsive and reports node non-responsive errors. The availability management system also includes a multicomponent error correlator (MCEC), which uses pre-specified rules to correlate multiple specific and non-specific errors and infer a particular component problem. If a particular component problem is found, the MCEC reports a component status change to the availability manager.
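The rule-based correlation attributed to the MCEC in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the class names (`Rule`, `AvailabilityManager`, `ErrorCorrelator`), the error labels, and the `"failed"` status value are all hypothetical, and the rule format (a set of error types that together implicate one component) is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    # Hypothetical rule format: when every error in required_errors has
    # been observed, suspect_component is inferred to have a problem.
    required_errors: frozenset
    suspect_component: str


@dataclass
class AvailabilityManager:
    # Stand-in for the availability manager: it only records the
    # status changes reported to it.
    status: dict = field(default_factory=dict)

    def report_status_change(self, component, new_status):
        self.status[component] = new_status


@dataclass
class ErrorCorrelator:
    # Stand-in for the MCEC: it accumulates error reports and checks
    # them against the pre-specified rules.
    rules: list
    manager: AvailabilityManager
    seen: set = field(default_factory=set)

    def report_error(self, error):
        self.seen.add(error)
        for rule in self.rules:
            # Subset test: have all errors named by the rule been seen?
            if rule.required_errors <= self.seen:
                self.manager.report_status_change(
                    rule.suspect_component, "failed"
                )
                # Consume the matched errors so the rule does not re-fire.
                self.seen -= rule.required_errors
                return rule.suspect_component
        return None
```

In this sketch a single error report does nothing by itself; the correlator reports a status change to the manager only once the errors listed in a rule have all been observed.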
Publication number: EP1134658(A3) Publication date: 2002.06.19
Application number: EP20010650026 Filing date: 2001.03.14
Applicant: SUN MICROSYSTEMS, INC. Inventors: KAMPE, MARK A.; HISGEN, ANDREW
Classification: G06F11/00 Main classification: G06F11/00
Agency: Agent:
Principal claim:
Address: