Independent claim |
1. A method of monitoring the health of a database appliance comprising a database distributed on a plurality of database nodes, and having redundant master nodes including a primary master node and a standby master node, the database being active on and controlled by only one of said redundant master nodes at a time, the method comprising:
executing concurrently in parallel and independently on both said redundant master nodes a database monitoring process, said database monitoring process comprising a resolution process and a health monitor process, the resolution process resolving on which one of said redundant master nodes said database is active at said time, said one node being designated the primary master node, and confirming that said primary master node is executing said health monitor process to monitor hardware and software components of said database and report alerts, the other redundant master node being the standby master node and not issuing alerts;
resolving by executing said resolution process in parallel by said primary master node and said standby master node whether the database is running on said primary master node, including:
attempting, by the primary master node, a first login to the database on the primary master node; and
attempting concurrently, by the standby master node, a second login to the database on the primary master node;
upon said first and second logins being successful, resolving by the resolution process on the primary master node that the database is running on the primary master node and that the primary master node is executing said health monitor process of hardware and software components of the database;
monitoring by the standby master node the status of the primary master node to detect a failure of the primary master node, including:
attempting, by the standby master node, a third login to the database on the primary master node after a first predetermined period of time;
upon identifying that the third login attempt is unsuccessful, determining that the primary master node has failed based on the unsuccessful third login attempt; and
upon determining said failure of the primary master node by the standby master node:
attempting, by the standby master node, a fourth login to the database on the standby master node;
upon the fourth login attempt by the standby master node being successful, determining that the database is active on said standby master node; and
executing said health monitor process of said components of said database by the standby master node in response to determining that the fourth login attempt was successful. |
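
The login sequence recited in the claim can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the names `try_login`, `resolve_roles`, `standby_cycle`, `FakeCluster`, and the node names `master-a` and `master-b` are all hypothetical. The two resolution logins run one after the other here rather than concurrently, the wait for the first predetermined period is left to the caller, and the `database_unavailable` outcome covers a case the claim does not address.

```python
from typing import Callable, Dict, Optional

# Hypothetical probe: try_login(from_node, db_host) returns True when a login
# from `from_node` to the database on `db_host` succeeds.
LoginProbe = Callable[[str, str], bool]


def resolve_roles(primary: str, standby: str,
                  try_login: LoginProbe) -> Optional[Dict[str, str]]:
    """Resolution step: both master nodes log in to the database on the
    presumed primary (the first and second logins of the claim)."""
    first_ok = try_login(primary, primary)    # first login, made by the primary
    second_ok = try_login(standby, primary)   # second login, made by the standby
    if first_ok and second_ok:
        # The database is resolved as running on the primary, which monitors
        # health and reports alerts; the standby issues no alerts.
        return {primary: "primary", standby: "standby"}
    return None  # roles not resolved in this sketch


def standby_cycle(primary: str, standby: str, try_login: LoginProbe) -> str:
    """One standby check, to be run after the first predetermined period."""
    if try_login(standby, primary):           # third login
        return "primary_alive"
    # Third login failed: the standby treats the primary as failed.
    if try_login(standby, standby):           # fourth login
        # Database is active on the standby, which starts health monitoring.
        return "standby_takes_over"
    # Assumption: outcome not described in the claim.
    return "database_unavailable"


class FakeCluster:
    """Stand-in for a real appliance: a login succeeds only against the
    node on which the database is currently active."""

    def __init__(self, active_on: Optional[str]) -> None:
        self.active_on = active_on

    def try_login(self, from_node: str, db_host: str) -> bool:
        return db_host == self.active_on


if __name__ == "__main__":
    cluster = FakeCluster(active_on="master-a")
    print(resolve_roles("master-a", "master-b", cluster.try_login))
    print(standby_cycle("master-a", "master-b", cluster.try_login))
    cluster.active_on = "master-b"            # simulate failover of the database
    print(standby_cycle("master-a", "master-b", cluster.try_login))
```

In a real deployment the probe would wrap an actual database client connection attempt, and `standby_cycle` would be called on a timer whose interval corresponds to the first predetermined period of time.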