Abstract |
A method and apparatus are provided for maintaining an item-to-node mapping among nodes in a distributed cluster. Each node maintains locally-stored system-state information indicating that node's understanding of which master nodes are alive and which are dead. Instead of relying on a global item-to-node mapping, each node acts upon a mapping that it determines locally from its locally-stored system-state information. Any two nodes with the same locally-stored system-state information determine the same mapping. A node updates its locally-stored system-state information when it detects a node failure or receives a message from another node indicating different locally-stored system-state information. The new locally-stored system-state information is transmitted on a need-to-know basis, so nodes with different item-to-node mappings may operate concurrently. Mechanisms are employed to prevent nodes from assuming conflicting ownership of items, which allows information about node failures to propagate via asynchronous messaging instead of requiring a cluster-wide synchronization event.
|
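
The idea of a locally determined mapping can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how the mapping is computed, so the sketch substitutes rendezvous (highest-random-weight) hashing as one deterministic function of the set of nodes believed alive. The names `owner`, `Node`, `on_failure_detected`, and `on_message` are hypothetical, and merging state by taking the union of known-dead nodes is an assumption that ignores node recovery. The mechanisms that prevent conflicting ownership are not modeled.

```python
import hashlib


def owner(item, alive):
    """Pick the owner of `item` among `alive` nodes (rendezvous hashing).

    Assumption: the source does not specify a mapping function; this is
    one deterministic choice that depends only on (item, alive set).
    """
    if not alive:
        raise ValueError("no live nodes")

    def score(node):
        # Hash each (node, item) pair; the highest digest wins.
        return hashlib.sha256(f"{node}\x00{item}".encode()).digest()

    return max(alive, key=score)


class Node:
    """Hypothetical cluster node holding only local system-state information."""

    def __init__(self, name, all_nodes):
        self.name = name
        self.all_nodes = frozenset(all_nodes)
        self.dead = set()  # nodes this node currently believes are dead

    def alive(self):
        return self.all_nodes - self.dead

    def owner_of(self, item):
        # The mapping is derived locally; no global table is consulted.
        return owner(item, self.alive())

    def on_failure_detected(self, node):
        self.dead.add(node)

    def on_message(self, sender_dead):
        # Assumption: states are merged by union of known-dead nodes.
        self.dead |= set(sender_dead)


cluster = ["n1", "n2", "n3", "n4"]
a, b = Node("n1", cluster), Node("n2", cluster)
items = [f"item-{i}" for i in range(50)]

before = {it: a.owner_of(it) for it in items}
a.on_failure_detected("n3")   # only node a learns of the failure
after = {it: a.owner_of(it) for it in items}
b.on_message(a.dead)          # b learns later, via an ordinary message
```

Between `a.on_failure_detected("n3")` and `b.on_message(a.dead)`, the two nodes operate with different mappings, which is the window the abstract's conflict-avoidance mechanisms would have to cover. With rendezvous hashing, only items previously mapped to the failed node change owner.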