Title of invention: Stream-based data deduplication using directed cyclic graphs to facilitate on-the-wire compression
Abstract: Stream-based data deduplication is provided in a multi-tenant shared infrastructure but without requiring “paired” endpoints having synchronized data dictionaries. Data objects processed by the dedupe functionality are treated as objects that can be fetched as needed. As such, a decoding peer does not need to maintain a symmetric library for the origin. Rather, if the peer does not have the chunks it needs in cache, it follows a conventional content delivery network procedure to retrieve them. In this way, if the dictionaries of a sending peer and a receiving peer are out of sync, the relevant sections are re-synchronized on demand. The approach does not require that the libraries maintained at a particular pair of sending and receiving peers be the same. Rather, the technique enables a peer, in effect, to “backfill” its dictionary on the fly. On-the-wire compression techniques are provided to reduce the amount of data transmitted between the peers.
Publication number: US2014189070(A1) Publication date: 2014.07.03
Application number: US201314139902 Filing date: 2013.12.24
Applicant: Akamai Technologies, Inc. Inventor: Gero Charles E.
Classification: H04L29/08 Main classification: H04L29/08
Agency: Agent:
Main claim: 1. Apparatus operative in a data deduplication system, the system comprising a sending peer entity, and a receiving peer entity, wherein each entity supports a deduplication engine that provides stream-based data deduplication by examining data that flows through the peer entity and replacing blocks of the data with references that point into a data dictionary, the apparatus comprising: a data structure in association with the sending peer entity, the data structure representing temporal and ordered relationships among blocks of data that have been seen in the data stream by the sending peer entity; and an encoding mechanism that uses information in the data structure to replace one or more references to blocks of data that have been seen in the data stream by the sending peer entity by a compact data representation.
Address: Cambridge MA US
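
The on-demand "backfill" behavior described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: fixed-size 8-byte blocks, SHA-256 digests as references, the token shapes `("raw", data)` and `("ref", digest)`, and the `fetch` callback, which stands in for whatever content delivery network retrieval procedure a real peer would use.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Token = Tuple[str, bytes]  # ("raw", block bytes) or ("ref", block digest)


def chunk_id(data: bytes) -> bytes:
    """Hypothetical reference format: the SHA-256 digest of the block."""
    return hashlib.sha256(data).digest()


def encode(stream: bytes, dictionary: Dict[bytes, bytes], size: int = 8) -> List[Token]:
    """Sending peer: replace blocks already in the dictionary with references."""
    tokens: List[Token] = []
    for i in range(0, len(stream), size):
        block = stream[i:i + size]
        key = chunk_id(block)
        if key in dictionary:
            tokens.append(("ref", key))
        else:
            dictionary[key] = block
            tokens.append(("raw", block))
    return tokens


def decode(tokens: List[Token], cache: Dict[bytes, bytes],
           fetch: Callable[[bytes], bytes]) -> bytes:
    """Receiving peer: resolve references, backfilling the cache on a miss."""
    out = bytearray()
    for kind, payload in tokens:
        if kind == "raw":
            cache[chunk_id(payload)] = payload
            out += payload
        else:
            if payload not in cache:
                # Dictionaries are out of sync: fetch only the missing block.
                cache[payload] = fetch(payload)
            out += cache[payload]
    return bytes(out)


# Usage: the receiver starts with an empty cache, so it never saw the raw blocks.
sender_dict: Dict[bytes, bytes] = {}
message = b"ABCDEFGH" * 3
encode(message, sender_dict)            # first pass populates the sender side
tokens = encode(message, sender_dict)   # second pass is references only

fetched: List[bytes] = []


def fetch_from_sender(key: bytes) -> bytes:
    fetched.append(key)
    return sender_dict[key]


receiver_cache: Dict[bytes, bytes] = {}
restored = decode(tokens, receiver_cache, fetch_from_sender)
```

In this sketch the receiver fetches a missing block once and then serves later references to it from its own cache, so the two dictionaries never have to be identical before decoding starts.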
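
The main claim pairs a data structure that records ordered relationships among previously seen blocks with an encoder that replaces references by a compact representation. The record does not specify that representation, so the following is only one possible reading, under stated assumptions: block references are short strings, the graph maps each reference to the ordered list of references seen immediately after it (cycles are allowed), a run is written as a hypothetical `(start, steps)` pair, and the decoder is assumed to hold the same edges.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]  # reference -> ordered successors seen so far


def record(graph: Graph, refs: List[str]) -> None:
    """Add an edge for every adjacent pair of references in the stream."""
    for cur, nxt in zip(refs, refs[1:]):
        successors = graph.setdefault(cur, [])
        if nxt not in successors:
            successors.append(nxt)
    if refs:
        graph.setdefault(refs[-1], [])


def compress(graph: Graph, refs: List[str]) -> List[Tuple[str, int]]:
    """Collapse runs that follow unambiguous edges into (start, steps) pairs."""
    runs: List[Tuple[str, int]] = []
    i = 0
    while i < len(refs):
        cur = refs[i]
        steps = 0
        # Extend only while the current node has exactly one outgoing edge
        # and that edge leads to the next reference in the stream.
        while i + steps + 1 < len(refs) and graph.get(cur) == [refs[i + steps + 1]]:
            cur = refs[i + steps + 1]
            steps += 1
        runs.append((refs[i], steps))
        i += steps + 1
    return runs


def expand(graph: Graph, runs: List[Tuple[str, int]]) -> List[str]:
    """Inverse of compress: walk the single outgoing edge `steps` times."""
    refs: List[str] = []
    for start, steps in runs:
        cur = start
        refs.append(cur)
        for _ in range(steps):
            cur = graph[cur][0]
            refs.append(cur)
    return refs


# Usage: the stream a, b, c repeats, so the graph contains the cycle a->b->c->a.
graph: Graph = {}
record(graph, ["a", "b", "c", "a", "b", "c"])
stream = ["a", "b", "c", "a", "b"]
runs = compress(graph, stream)
round_trip = expand(graph, runs)

# A second successor for "a" makes that node ambiguous, so runs stop there.
record(graph, ["a", "d"])
branched = compress(graph, ["a", "b", "c"])
```

Restricting runs to nodes with a single outgoing edge keeps decoding deterministic without sending any per-step choice; a fuller scheme would also need a way to name which edge to follow at a branching node.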