Abstract |
An improved sliding-window chunking apparatus and method that compares the fingerprint value at each position in a data set to a second set of criteria, at least in instances when the value does not satisfy a first set of criteria, and, if the value satisfies the second set of criteria, identifies the position as a potential breakpoint. If no fingerprint value satisfying the first set of criteria is found before a maximum chunk size is reached, the potential breakpoint can be designated as a breakpoint. Further improvement is possible by imposing minimum and maximum sizes on chunks. In some instances, more than two sets of criteria may be used to identify additional potential breakpoints, to be used should no subset having a fingerprint value that satisfies either of the first two sets of criteria be found.
|
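The method summarized in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation. The function name `chunk_boundaries`, the polynomial rolling fingerprint, the window length, the size limits, and the two divisors are all assumptions chosen for illustration. Here the "first set of criteria" is modeled as `fp % main_divisor == main_divisor - 1` and the "second set" as the same test with a smaller `backup_divisor`, so the second test is satisfied more often.

```python
from typing import List


def chunk_boundaries(data: bytes, window: int = 16, min_size: int = 64,
                     max_size: int = 256, main_divisor: int = 64,
                     backup_divisor: int = 32) -> List[int]:
    """Return the end offsets of chunks (hypothetical helper, illustrative only)."""
    base = 257                       # assumed polynomial base
    mod = (1 << 61) - 1              # assumed modulus for the fingerprint
    top = pow(base, window - 1, mod)  # weight of the byte leaving the window

    boundaries: List[int] = []
    start = 0     # offset where the current chunk begins
    backup = -1   # latest potential breakpoint in this chunk, -1 if none
    fp = 0        # fingerprint of the last `window` bytes

    for i, byte in enumerate(data):
        # Slide the window: drop the oldest byte, then append the new one.
        if i >= window:
            fp = (fp - data[i - window] * top) % mod
        fp = (fp * base + byte) % mod

        size = i - start + 1
        if size < min_size:
            continue  # enforce the minimum chunk size

        if fp % main_divisor == main_divisor - 1:
            # First set of criteria satisfied: a breakpoint.
            boundaries.append(i + 1)
            start, backup = i + 1, -1
            continue

        if fp % backup_divisor == backup_divisor - 1:
            # Second set of criteria satisfied: remember a potential breakpoint.
            backup = i + 1

        if size >= max_size:
            # Maximum size reached without a breakpoint: fall back to the
            # potential breakpoint if one exists, else cut at this position.
            cut = backup if backup != -1 else i + 1
            boundaries.append(cut)
            start, backup = cut, -1

    if start < len(data):
        boundaries.append(len(data))  # trailing chunk, may be below min_size
    return boundaries
```

Under these assumptions, every chunk except possibly the last has a size between `min_size` and `max_size`. A forced cut at the maximum size happens only when neither test produced a candidate position. Further sets of criteria could be added as additional fallback levels in the same way.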