Title of invention: METHOD FOR FAILURE-RESILIENT DATA PLACEMENT IN A DISTRIBUTED QUERY PROCESSING SYSTEM
Abstract: Herein is described a data placement scheme for a distributed query processing system that achieves load balance amongst the nodes of the system. To identify a node on which to place particular data, a supervisor node performs a placement algorithm over the particular data's identifier, where the placement algorithm utilizes two or more hash functions. The supervisor node runs the placement algorithm until a destination node is identified that is available to store the data, or the supervisor node has run the placement algorithm an established number of times. If no available node is identified using the placement algorithm, then an available destination node is identified for the particular data and information identifying the data and the selected destination node is included in an exception map. Most data may be located by any node in the system based on the node performing the placement algorithm for the required data.
Publication number: US2016328456(A1) Publication date: 2016.11.10
Application number: US201514704825 Filing date: 2015.05.05
Applicant: Oracle International Corporation Inventors: Zhang Gong;Petride Sabina;Klots Boris;Idicula Sam;Agarwal Nipun
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim: 1. A computerized distributed query processing system comprising: a plurality of computing devices, each computing device being configured with a data store; and a supervisor computing device communicatively connected to the plurality of computing devices; wherein the supervisor computing device is configured to: identify a particular computing device of the plurality of computing devices as a destination computing device of a particular unit of data; wherein the particular unit of data is uniquely identified, among all units of data stored on the computerized distributed query processing system, by a particular data identifier; to identify said particular computing device, said supervisor computing device is configured to: perform a placement function, comprising two or more hash functions, based, at least in part, on the particular data identifier, wherein said supervisor computing device being configured to perform the placement function comprises said supervisor computing device being configured to: combine results of the two or more hash functions to produce combined results, and identify the particular computing device to be the destination computing device of the particular unit of data based on the combined results; and to cause the particular unit of data to be stored on the data store of the particular computing device as the destination computing device of the particular unit of data.
Address: Redwood Shores CA US
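
Illustrative sketch (not part of the patent record): the abstract describes a placement algorithm that combines two or more hash functions over a data identifier, retries up to an established number of times, and falls back to an exception map. The Python below is one possible reading of that description, not the claimed implementation. Every name in it (`placement`, `place`, `locate`, `MAX_ATTEMPTS`), the choice of SHA-256 and BLAKE2b, the XOR combination, the way the attempt number is mixed into the hash input, and the "lowest-numbered available node" fallback are assumptions made for illustration; the record does not specify any of them.

```python
import hashlib

# Assumed retry limit; the abstract only says "an established number of times".
MAX_ATTEMPTS = 3


def _h1(key: bytes) -> int:
    """First hash function (assumed: SHA-256, truncated to 64 bits)."""
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def _h2(key: bytes) -> int:
    """Second hash function (assumed: BLAKE2b, truncated to 64 bits)."""
    return int.from_bytes(hashlib.blake2b(key).digest()[:8], "big")


def placement(data_id: str, attempt: int, num_nodes: int) -> int:
    """Combine both hash results (assumed: XOR) and map them to a node index.

    Mixing the attempt number into the key is an assumption about how
    repeated runs of the placement algorithm yield different candidates.
    """
    key = f"{data_id}#{attempt}".encode("utf-8")
    return (_h1(key) ^ _h2(key)) % num_nodes


def place(data_id: str, available: set, num_nodes: int, exception_map: dict) -> int:
    """Supervisor side: choose a destination node for data_id."""
    if not available:
        raise RuntimeError("no available node")
    for attempt in range(MAX_ATTEMPTS):
        node = placement(data_id, attempt, num_nodes)
        if node in available:
            return node
    # No candidate was available: pick any available node (assumed: the
    # lowest-numbered one) and record the choice in the exception map.
    node = min(available)
    exception_map[data_id] = node
    return node


def locate(data_id: str, available: set, num_nodes: int, exception_map: dict):
    """Any node: find where data_id lives, assuming the same availability view."""
    if data_id in exception_map:
        return exception_map[data_id]
    for attempt in range(MAX_ATTEMPTS):
        node = placement(data_id, attempt, num_nodes)
        if node in available:
            return node
    return None
```

For example, `place("orders_part_7", {0, 1, 2, 3}, 4, {})` returns a node index in `0..3` and leaves the exception map empty, because every candidate index is available. The sketch assumes that `locate` sees the same set of available nodes that `place` saw; how the system described in the record handles changes in availability after placement is not covered here.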