摘要 |
PCT No. PCT/AU96/00081 Sec. 371 Date Aug. 15, 1997 Sec. 102(e) Date Aug. 15, 1997 PCT Filed Feb. 15, 1996 PCT Pub. No. WO96/25801 PCT Pub. Date Aug. 22, 1996This invention provides a method and apparatus for detecting common spans within one or more data blocks by partitioning the blocks (FIG. 4) into subblocks and searching the group of subblocks (FIG. 12) (or their corresponding hashes (FIG. 13)) for duplicates. Blocks can be partitioned into subblocks using a variety of methods, including methods that place subblock boundaries at fixed positions (FIG. 3), methods that place subblock boundaries at data-dependent positions (FIG. 3), and methods that yield multiple overlapping subblocks (FIG. 6). By comparing the hashes of subblocks, common spans of one or more blocks can be identified without ever having to compare the blocks or subblocks themselves (FIG. 13). This leads to several applications including an incremental backup system that backs up changes rather than changed files (FIG. 25), a utility that determines the similarities and differences between two files (FIG. 13), a file system that stores each unique subblock at most once (FIG. 26), and a communications system that eliminates the need to transmit subblocks already possessed by the receiver (FIG. 19).
|