Abstract |
In a distributed caching and scheduling method for a shared-nothing computing framework, the framework includes an aggregator node and multiple computing nodes, each with a local processor, storage unit, and memory. The method includes separating a dataset into multiple data segments; distributing the data segments across the local storage units; and, at each computing node, copying the node's data segment from the storage unit to the memory, processing the data segment to compute a partial result, and sending the partial result to the aggregator node. The method further includes determining which data segments are stored in the local memory of each computing node, and coordinating additional computing jobs based on that determination. Coordinating can include scheduling new computing jobs that use the data segments already stored in local memory, or scheduling jobs so as to maximize the use of those data segments.
|
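The flow described in the abstract can be illustrated with a minimal single-process sketch. It is not the claimed implementation: the names `ComputeNode`, `Aggregator`, `distribute`, `pick_node`, and `run_job` are hypothetical, the round-robin split is an assumed partitioning rule, and the nodes are simulated as in-process objects rather than separate machines. The `loads` counter shows that a second job reuses segments already held in local memory instead of copying them from storage again.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """Hypothetical computing node with a local storage unit and local memory."""
    name: str
    storage: dict = field(default_factory=dict)  # segment id -> data on local storage
    memory: dict = field(default_factory=dict)   # segment id -> data cached in memory
    loads: int = 0                               # storage-to-memory copies performed

    def process(self, seg_id, func):
        # Copy the segment from storage to memory only if it is not cached yet.
        if seg_id not in self.memory:
            self.memory[seg_id] = self.storage[seg_id]
            self.loads += 1
        # Compute the partial result from the in-memory copy.
        return func(self.memory[seg_id])


class Aggregator:
    """Hypothetical aggregator node that distributes data and schedules jobs."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.segment_ids = []

    def distribute(self, dataset):
        # Assumed rule: one segment per node, split round-robin.
        n = len(self.nodes)
        self.segment_ids = list(range(n))
        for seg_id, node in enumerate(self.nodes):
            node.storage[seg_id] = dataset[seg_id::n]

    def pick_node(self, seg_id):
        # Determine which nodes hold the segment, preferring one that
        # already has it in local memory.
        holders = [nd for nd in self.nodes if seg_id in nd.storage]
        cached = [nd for nd in holders if seg_id in nd.memory]
        return (cached or holders)[0]

    def run_job(self, func, combine):
        # Each chosen node returns a partial result; the aggregator combines them.
        partials = [self.pick_node(s).process(s, func) for s in self.segment_ids]
        return combine(partials)


if __name__ == "__main__":
    nodes = [ComputeNode(f"node{i}") for i in range(3)]
    agg = Aggregator(nodes)
    agg.distribute(list(range(10)))

    total = agg.run_job(sum, sum)    # first job copies each segment into memory
    largest = agg.run_job(max, max)  # second job reuses the cached segments
    print(total, largest, [nd.loads for nd in nodes])  # prints: 45 9 [1, 1, 1]
```

In this sketch every segment lives on exactly one node, so cache-aware scheduling reduces to routing each job to the segment's owner. `pick_node` is written so that it would also prefer a cached copy if segments were replicated across nodes, which is an extension the abstract does not describe.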