Abstract |
Stream-based data deduplication is provided in a multi-tenant shared infrastructure without requiring "paired" endpoints that have synchronized data dictionaries. In this approach, data objects processed by the dedupe functionality are treated as objects that can be fetched as needed. Because the compressed objects are treated as ordinary objects, a decoding peer does not need to maintain a symmetric library for the origin. Rather, if the peer does not have the chunks it needs in cache, it follows a conventional content delivery network (CDN) procedure to retrieve them. In this way, if the dictionaries of a sending peer and a receiving peer are out of sync, the relevant sections are then resynchronized on demand. The approach does not require that the libraries maintained at a particular pair of sending and receiving peers be the same. Rather, the technique enables a peer, in effect, to "backfill" its dictionary on the fly. |
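
The on-demand backfill described above can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: chunks are addressed by a SHA-256 digest, the deduplicated stream is a list of `("raw", bytes)` and `("ref", digest)` tokens, and `fetch_chunk` is a hypothetical callable standing in for whatever CDN retrieval procedure the peer uses. None of these names come from the source.

```python
import hashlib
from typing import Callable, Dict, List, Tuple, Union

# Assumed token format: ("raw", literal bytes) or ("ref", hex digest of a chunk).
Token = Tuple[str, Union[bytes, str]]


def digest(chunk: bytes) -> str:
    """Content address of a chunk (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(chunk).hexdigest()


class Decoder:
    """Decoding peer that backfills its dictionary on a cache miss."""

    def __init__(self, fetch_chunk: Callable[[str], bytes]) -> None:
        self.cache: Dict[str, bytes] = {}  # local dictionary, may be incomplete
        self.fetch_chunk = fetch_chunk     # hypothetical CDN-style fetch
        self.misses = 0

    def resolve(self, ref: str) -> bytes:
        chunk = self.cache.get(ref)
        if chunk is None:
            # Dictionary is out of sync for this chunk: fetch it like any object.
            self.misses += 1
            chunk = self.fetch_chunk(ref)
            if digest(chunk) != ref:
                raise ValueError("fetched chunk does not match its digest")
            self.cache[ref] = chunk        # backfill for later references
        return chunk

    def decode(self, tokens: List[Token]) -> bytes:
        out = bytearray()
        for kind, value in tokens:
            if kind == "raw":
                out += value
            elif kind == "ref":
                out += self.resolve(value)
            else:
                raise ValueError(f"unknown token kind: {kind!r}")
        return bytes(out)
```

In this sketch the decoder starts with an empty dictionary and still decodes correctly; each missing chunk is fetched once and later references are served from the local cache, so the sender never needs to know what the receiver already holds.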