Title: Method and apparatus for content-aware and adaptive deduplication
Abstract: A method, a system, an apparatus, and a computer readable medium for transmission of data across a network are disclosed. The method includes receiving a data stream, analyzing the received data stream to determine a starting location and an ending location of each zone within the received data stream, based on the starting and ending locations, generating a zone stamp identifying the zone, the zone stamp includes a sequence of contiguous characters representing at least a portion of data in the zone, wherein the order of characters in the zone stamp corresponds to the order of data in the zone, comparing the zone stamp with another zone stamp of another zone in any data stream received, determining whether the zone is substantially similar to another zone by detecting that the zone stamp is substantially similar to another zone stamp, delta-compressing zones within any data stream received that have been determined to have substantially similar zone stamps, thereby deduplicating zones having substantially similar zone stamps within any data stream received, and transmitting the deduplicated zones across the network from one storage location to another storage location.
Publication number: US9223794(B2)  Publication date: 2015.12.29
Application number: US201414444700  Filing date: 2014.07.28
Applicant: Exagrid Systems, Inc.  Inventors: Therrien David G.;Thompson David Andrew
Classification: G06F17/30;H03M7/30;H04L29/08;G06F11/14  Main classification: G06F17/30
Agency: Mintz Levin Cohn Ferris Glovsky and Popeo, P.C.  Agent: Mintz Levin Cohn Ferris Glovsky and Popeo, P.C.
Main claim: 1. A system, comprising a first deduplication and storage appliance including a first deduplication processor and a first memory communicatively coupled to the first deduplication processor; the first deduplication and storage appliance receiving a data stream from at least one server in a plurality of servers, the plurality of servers being communicatively coupled to the first deduplication and storage appliance, the data stream including a plurality of zones, each zone in the plurality of zones being represented by a zone stamp and being characterized by a predetermined minimum and maximum zone size and a predetermined minimum and maximum zone stamp length; the first deduplication processor delta-compressing zones in the received data stream based on a determination that a zone in the plurality of zones is substantially similar to another zone upon detecting that a zone stamp representing the zone is substantially similar to another zone stamp representing the another zone; and has a size greater than the predetermined minimum zone size and less than the predetermined maximum size and a stamp length greater than the predetermined minimum zone stamp length; the first memory storing zones delta-compressed by the first deduplication processor.
Address: Westborough, MA, US
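
Illustrative sketch (not part of the patent record): the abstract describes generating an order-preserving zone stamp, comparing stamps for substantial similarity, and delta-compressing zones whose stamps match. The Python below is a minimal sketch of that flow under stated assumptions. The names zone_stamp, similar, delta_compress and delta_decompress, the constants WINDOW, MODULUS and THRESHOLD, the CRC32-based sampling rule, and the use of difflib for stamp comparison are all hypothetical choices made for illustration. A zlib preset dictionary stands in for delta compression; the record does not say which delta encoder or similarity measure the applicant uses.

```python
import difflib
import zlib

WINDOW = 8       # bytes hashed per candidate position (assumed value)
MODULUS = 16     # keeps roughly one position in 16 (assumed value)
THRESHOLD = 0.8  # ratio treated as "substantially similar" (assumed value)


def zone_stamp(zone: bytes) -> str:
    """Return a short string sampled from the zone, in data order."""
    chars = []
    for i in range(len(zone) - WINDOW + 1):
        h = zlib.crc32(zone[i:i + WINDOW])
        if h % MODULUS == 0:
            # Map the selected window to one of 16 stamp characters.
            chars.append(chr(ord("A") + (h // MODULUS) % 16))
    return "".join(chars)


def similar(stamp_a: str, stamp_b: str) -> bool:
    """Compare two stamps; autojunk is disabled because stamp
    characters repeat often and would otherwise be ignored."""
    matcher = difflib.SequenceMatcher(None, stamp_a, stamp_b, autojunk=False)
    return matcher.ratio() >= THRESHOLD


def delta_compress(zone: bytes, reference: bytes) -> bytes:
    """Encode zone against reference (zlib preset dictionary as a
    stand-in for a real delta encoder)."""
    comp = zlib.compressobj(zdict=reference)
    return comp.compress(zone) + comp.flush()


def delta_decompress(delta: bytes, reference: bytes) -> bytes:
    """Rebuild the zone from its delta and the same reference."""
    decomp = zlib.decompressobj(zdict=reference)
    return decomp.decompress(delta) + decomp.flush()
```

In use, a new zone whose stamp is similar to a stored zone's stamp would be passed to delta_compress with the stored zone as the reference, and only the resulting delta would be stored or transmitted. The zlib dictionary window is limited to 32 KB, so this stand-in only suits small zones.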