Title: Higher efficiency storage replication using compression
Abstract: In one embodiment, there is a multi-cluster synchronization system between two or more clusters. The multi-cluster synchronization system uses variable compression to optimize the transfer of information between the clusters. Compression is used not only to minimize the total number of bytes sent between the two clusters, but also to dynamically vary the size of the objects sent across the wire, optimizing for higher throughput after considering packet loss, TCP windows, and block sizes. This includes both the packaging of multiple small files together into one larger compressed file, which saves on TCP and header overhead, and the chunking of large files into multiple smaller files that are less likely to have difficulties due to intermittent network congestion or errors. A further embodiment uses forward error correction to maximize the chances that the remote end will be able to correctly reconstitute the transmission.
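The two packaging strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `pack_small`, `unpack_small`, and `chunk_large` are hypothetical, and the choice of a gzip-compressed tar archive as the container is an assumption made only for illustration.

```python
import io
import tarfile
from typing import Dict, List


def pack_small(files: Dict[str, bytes]) -> bytes:
    """Bundle several small files into one gzip-compressed tar archive.

    Sending one archive instead of many tiny objects avoids paying
    per-object connection and header overhead. (Container format is an
    assumption; the source does not name one.)
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack_small(archive: bytes) -> Dict[str, bytes]:
    """Recover the original files from an archive built by pack_small."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        return {m.name: tar.extractfile(m).read() for m in tar.getmembers()}


def chunk_large(data: bytes, target_size: int) -> List[bytes]:
    """Split one large payload into pieces of at most target_size bytes.

    A failed or congested transfer then costs one chunk, not the whole file.
    """
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return [data[i:i + target_size] for i in range(0, len(data), target_size)]
```

The receiving side would call `unpack_small` on bundled archives and concatenate chunks in order to rebuild large files.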
Publication number: US9560093(B2)  Publication date: 2017.01.31
Application number: US201414323726  Filing date: 2014.07.03
Applicant: Rackspace US, Inc.  Inventors: Holt Gregory Lee; Gerrard Clay; Goetz David Patrick; Barton Michael
Classification: G06F17/30; H04L29/06; G06F3/06; G06F11/20; H04L29/08; G06F11/10  Main classification: G06F17/30
Agency: Haynes and Boone, LLP  Agent: Haynes and Boone, LLP
Main claim: 1. A multi-cluster synchronization system, comprising: an intercluster network coupling a first cluster and a remote cluster, the first cluster including a first cluster-internal network, a first structured information repository, and a first object storage, wherein the first structured information repository contains metadata corresponding to stored information objects in the first object storage, and wherein the first structured information repository and the first object storage are coupled via the cluster-internal network; a network evaluator that determines a state of one or more networks coupled to the first cluster and the remote cluster; and an intercluster repository synchronizer including a compression module that identifies one or more information objects to compress from the first object storage and transmit to the remote cluster in compressed form, wherein the compression module determines a target size of a single compressed information object, determines, based on the state of the one or more networks, whether to increase or decrease the target size of the single compressed information object, and updates the target size in accordance with the determining whether to increase or decrease the target size, wherein the compression module compresses a first information object and a second information object stored in the first object storage and combines the first and second compressed information objects, wherein a size of each of the first compressed information object and second compressed information object is smaller than the updated target size, and the single information object includes the first and second compressed information objects, and wherein the compression module transmits the single information object to the remote cluster for storage, wherein transmitting the single information object results in the duplication of the first and second information objects at the remote cluster.
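The compression module's behaviour in the claim (update a target size from the network state, then compress two objects and combine them when each compressed object is smaller than the updated target) can be sketched in Python as follows. This is an illustrative reading, not the claimed implementation: `NetworkState`, `update_target_size`, `combine`, and `split` are hypothetical names, packet loss as the sole network metric is an assumption, and the halve-or-double policy, the size bounds, and the length-prefixed framing are all assumptions.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NetworkState:
    # Fraction of packets lost, 0.0 to 1.0 (hypothetical metric that a
    # network evaluator might report).
    packet_loss: float


def update_target_size(target: int, state: NetworkState,
                       loss_threshold: float = 0.01,
                       lo: int = 64 * 1024,
                       hi: int = 64 * 1024 * 1024) -> int:
    """Decrease the target on a lossy network, increase it on a clean one.

    The halve/double policy and the bounds are assumptions for illustration.
    """
    target = target // 2 if state.packet_loss > loss_threshold else target * 2
    return max(lo, min(hi, target))


def combine(first: bytes, second: bytes, target: int) -> Optional[bytes]:
    """Compress both objects and join them into a single object.

    Returns None when either compressed object is not smaller than the
    target size, mirroring the size condition in the claim.
    """
    parts = [zlib.compress(first), zlib.compress(second)]
    if any(len(p) >= target for p in parts):
        return None
    # Assumed framing: 4-byte big-endian length before each compressed object.
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)


def split(combined: bytes) -> List[bytes]:
    """Remote side: recover the original objects from a combined object."""
    objects, pos = [], 0
    while pos < len(combined):
        n = int.from_bytes(combined[pos:pos + 4], "big")
        pos += 4
        objects.append(zlib.decompress(combined[pos:pos + n]))
        pos += n
    return objects
```

In this sketch the remote cluster would call `split` on the received object and store each recovered object, which yields the duplication described at the end of the claim.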
Address: San Antonio TX US