Title of invention: System and method for building a point-in-time snapshot of an eventually-consistent data store
Abstract: A method and system for building a point-in-time snapshot of an eventually-consistent data store. The data store includes key-value pairs stored on a plurality of storage nodes. In one embodiment, the data store is implemented as an Apache® Cassandra database running in the “cloud.” The data store includes a journaling mechanism that stores journals (i.e., inconsistent snapshots) of the data store on each node at various intervals. In Cassandra, these snapshots are sorted string tables that may be copied to a back-up storage location. A cluster of processing nodes may retrieve and resolve the inconsistent snapshots to generate a point-in-time snapshot of the data store corresponding to a lagging consistency point. In addition, the point-in-time snapshot may be updated as any new inconsistent snapshots are generated by the data store such that the lagging consistency point associated with the updated point-in-time snapshot is more recent.
Publication number: US9613104(B2) Publication date: 2017.04.04
Application number: US201213399467 Filing date: 2012.02.17
Applicant: NETFLIX, Inc. Inventors: Smith Charles; Magnusson Jeffrey; Anand Siddharth
Classification: G06F7/00; G06F17/30; G06F11/14; G06F11/16; G06F11/20 Main classification: G06F7/00
Agency: Artegis Law Group, LLP Agent: Artegis Law Group, LLP
Main claim: 1. A computer-implemented method for building a point-in-time snapshot of an eventually-consistent data store distributed among a plurality of nodes connected by a network, the method comprising: receiving a plurality of inconsistent snapshots, wherein each inconsistent snapshot includes one or more rows of key-value pairs associated with the data store and reflects contents of at least a portion of the data store stored on a particular node of the plurality of nodes; and generating the point-in-time snapshot by resolving the one or more rows of the key-value pairs to remove any inconsistent values, wherein the point-in-time snapshot includes a subset of the key-value pairs included in the plurality of inconsistent snapshots, wherein generating the point-in-time snapshot comprises: dividing the one or more rows of the key-value pairs from the plurality of inconsistent snapshots into one or more processing tasks, wherein each processing task includes a different portion of the key-value pairs; distributing each processing task to one of a plurality of processing nodes configured to perform a reduce operation; receiving a number of results from the plurality of processing nodes corresponding to a number of distributed processing tasks; and combining the number of results to generate the point-in-time snapshot.
Address: Los Gatos, CA, US
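
The four steps recited in the main claim (divide, distribute, receive, combine) can be illustrated with a minimal Python sketch. It is not the patented implementation and rests on several assumptions that do not appear in the record: each row is a hypothetical `(key, value, timestamp)` tuple, inconsistent values are resolved by keeping the newest timestamp (a last-write-wins rule chosen only for illustration), and a local thread pool stands in for the plurality of processing nodes. The names `divide`, `reduce_task`, and `build_snapshot` are likewise invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple
import zlib

# Hypothetical row layout: (key, value, write timestamp).
Row = Tuple[str, str, int]


def divide(snapshots: List[List[Row]], num_tasks: int) -> List[List[Row]]:
    """Split all rows into tasks so that every copy of a key lands in one task."""
    tasks: List[List[Row]] = [[] for _ in range(num_tasks)]
    for snapshot in snapshots:
        for row in snapshot:
            # crc32 is stable across runs, unlike the built-in hash() for str.
            index = zlib.crc32(row[0].encode("utf-8")) % num_tasks
            tasks[index].append(row)
    return tasks


def reduce_task(rows: List[Row]) -> Dict[str, Tuple[str, int]]:
    """Keep one value per key: the newest timestamp wins (assumed rule)."""
    resolved: Dict[str, Tuple[str, int]] = {}
    for key, value, ts in rows:
        if key not in resolved or ts > resolved[key][1]:
            resolved[key] = (value, ts)
    return resolved


def build_snapshot(snapshots: List[List[Row]], num_tasks: int = 4) -> Dict[str, str]:
    """Divide rows into tasks, reduce each task on a worker, combine the results."""
    tasks = divide(snapshots, num_tasks)
    # A thread pool stands in for the cluster of processing nodes.
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        results = list(pool.map(reduce_task, tasks))
    combined: Dict[str, Tuple[str, int]] = {}
    for partial in results:
        # Tasks hold disjoint key sets, so a plain update cannot overwrite anything.
        combined.update(partial)
    return {key: value for key, (value, _ts) in combined.items()}


# Two nodes report different values for "user:1"; the newer write is kept.
node_a = [("user:1", "alice", 10), ("user:2", "bob", 12)]
node_b = [("user:1", "alicia", 15), ("user:3", "carol", 9)]
snapshot = build_snapshot([node_a, node_b])
```

Partitioning by key before the reduce step means each worker sees every copy of the keys it owns, so the combine step reduces to a merge of disjoint dictionaries.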