Title of invention PLACEMENT POLICY
Abstract A region-based placement policy that can be used to achieve a better distribution of data in a clustered storage system is disclosed herein. The clustered storage system includes a master module that implements the region-based placement policy for storing one or more copies of received data across many data nodes of the clustered storage system. When implementing the region-based placement policy, the master module splits the received data into one or more regions, where each region includes a contiguous portion of the received data. Further, for each of the regions, the master module stores complete copies of the region in a subset of the data nodes.
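The region-based placement described in the abstract can be illustrated with a minimal sketch. It assumes fixed-size regions and a round-robin choice of data nodes, neither of which is stated in the record; the names `split_into_regions`, `place_regions`, `region_size`, and `copies` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def split_into_regions(data: bytes, region_size: int) -> List[bytes]:
    """Split data into contiguous regions (fixed size is an assumption)."""
    if region_size <= 0:
        raise ValueError("region_size must be positive")
    return [data[i:i + region_size] for i in range(0, len(data), region_size)]


def place_regions(num_regions: int, nodes: List[str], copies: int) -> Dict[int, List[str]]:
    """Map each region index to the subset of nodes holding a complete copy.

    Round-robin selection is an illustrative assumption, not the
    policy claimed in the patent.
    """
    if not 1 <= copies <= len(nodes):
        raise ValueError("copies must be between 1 and the number of nodes")
    return {
        idx: [nodes[(idx + k) % len(nodes)] for k in range(copies)]
        for idx in range(num_regions)
    }


regions = split_into_regions(b"abcdefghij", region_size=4)
placement = place_regions(len(regions), ["n1", "n2", "n3", "n4"], copies=2)
```

Each region is kept whole on every node in its subset, so a single node can serve any read that falls within that region without contacting the others.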
Publication number US2016132518(A1) Publication date 2016.05.12
Application number US201614996627 Filing date 2016.01.15
Applicant Facebook, Inc. Inventors Muthukkaruppan Kannan; Ranganathan Karthik; Tang Liyin
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim 1. A clustered storage system comprising: a memory; one or more processors; and a master module that is in communication with one or more of multiple data nodes and that facilitates storage of data in the multiple data nodes, wherein the master module is configured to, when executed by the one or more processors: receive client data from a client system, wherein the client data comprises a data table including multiple rows and multiple columns; divide at least a portion of the data table, comprising at least two consecutive rows of the multiple rows and including less than all of the multiple rows, into two or more data files such that each data item in the portion of the data table with a common first column identifier is in a first of the two or more data files and each data item in the portion of the data table with a common second column identifier is in a second of the two or more data files; store the two or more data files in a primary data node by sending first file creation requests corresponding to each of the two or more files; and store a replica of the portion of the data table, including replicas of the two or more data files, in a secondary data node by sending second file creation requests corresponding to each of the two or more files.
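The steps recited in the claim can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: rows are modeled as dictionaries keyed by column identifier, a file creation request is modeled as a plain tuple, and the names `divide_by_column`, `file_creation_requests`, and the `.dat` file suffix are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]                     # column identifier -> data item
Request = Tuple[str, str, List[str]]     # (node, file name, file contents)


def divide_by_column(rows: List[Row], start: int, end: int) -> Dict[str, List[str]]:
    """Group the items of rows[start:end] into one data file per column identifier.

    The portion must cover at least two consecutive rows and fewer than
    all rows, mirroring the wording of the claim.
    """
    count = end - start
    if start < 0 or end > len(rows) or count < 2 or count >= len(rows):
        raise ValueError("portion must span >= 2 rows and fewer than all rows")
    files: Dict[str, List[str]] = {}
    for row in rows[start:end]:
        for column, item in row.items():
            files.setdefault(column, []).append(item)
    return files


def file_creation_requests(files: Dict[str, List[str]], node: str) -> List[Request]:
    """Build one (hypothetical) file creation request per data file for a node."""
    return [(node, f"{column}.dat", items) for column, items in files.items()]


table = [
    {"name": "ann", "city": "oslo"},
    {"name": "bob", "city": "rome"},
    {"name": "cat", "city": "lima"},
]
files = divide_by_column(table, 0, 2)
first_requests = file_creation_requests(files, "primary")     # primary data node
second_requests = file_creation_requests(files, "secondary")  # replica of the same portion
```

Because both sets of requests are built from the same per-column files, the secondary node receives a replica of the whole portion rather than an unrelated scattering of blocks.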
Address Menlo Park CA US