Title of Invention |
EFFICIENT CALCULATION OF SIMILARITY SEARCH VALUES AND DIGEST BLOCK BOUNDARIES FOR DATA DEDUPLICATION |
Abstract |
For efficient calculation of both similarity search values and boundaries of digest blocks in data deduplication, input data is partitioned into chunks, and for each chunk a set of rolling hash values is calculated. A single linear scan of the rolling hash values is used to produce both similarity search values and boundaries of the digest blocks of the chunk. |
Application Publication Number |
US2014279952(A1) |
Application Publication Date |
2014.09.18 |
Application Number |
US201313840094 |
Filing Date |
2013.03.15 |
Applicant |
INTERNATIONAL BUSINESS MACHINES CORPORATION |
Inventors |
AKIRAV Shay H.;ARONOVICH Lior;BEN-DOR Shira;HIRSCH Michael;LENEMAN Ofer |
Classification Number |
G06F17/30 |
Main Classification Number |
G06F17/30 |
Agency |
|
Agent |
|
Principal Claim |
1. A method for efficient calculation of both similarity search values and boundaries of digest blocks in a data deduplication system using a processor device in a computing environment, comprising:
partitioning input data into data chunks; calculating a set of rolling hash values for each of the data chunks; and using a single linear scan of the rolling hash values for producing both the similarity search values and the boundaries of the digest blocks. |
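For illustration only, the three claimed steps (partitioning, rolling hash calculation, one linear scan yielding both outputs) might be sketched in Python as follows. This record does not specify the hash function, the window size, the boundary criterion, or how similarity search values are selected. The polynomial hash, all constants, the mask-based boundary test, and the "k largest hash values" selection below are assumptions made for the sketch, not the patented method.

```python
import heapq
from typing import Iterator, List, Tuple

# All constants are hypothetical; the record gives no concrete values.
WINDOW = 16                # bytes covered by each rolling hash
BASE = 257                 # polynomial hash base
MOD = (1 << 61) - 1        # polynomial hash modulus
BOUNDARY_MASK = 0x3F       # assumed rule: boundary where low 6 bits are zero
NUM_SIM = 4                # assumed number of similarity values per chunk


def partition(data: bytes, chunk_size: int) -> List[bytes]:
    """Split input data into fixed-size chunks (last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def rolling_hashes(chunk: bytes, window: int = WINDOW) -> Iterator[Tuple[int, int]]:
    """Yield (end_offset, hash) for every window-sized span of the chunk."""
    if len(chunk) < window:
        return
    top = pow(BASE, window - 1, MOD)   # weight of the byte leaving the window
    h = 0
    for b in chunk[:window]:
        h = (h * BASE + b) % MOD
    yield window, h
    for i in range(window, len(chunk)):
        # Drop the oldest byte, shift, and add the newest byte.
        h = ((h - chunk[i - window] * top) * BASE + chunk[i]) % MOD
        yield i + 1, h


def scan_chunk(chunk: bytes) -> Tuple[List[int], List[int]]:
    """One pass over the rolling hashes producing both outputs.

    Returns (similarity_values, block_boundaries), where boundaries are
    exclusive end offsets of digest blocks within the chunk.
    """
    heap: List[int] = []          # min-heap holding the NUM_SIM largest hashes
    boundaries: List[int] = []
    for end, h in rolling_hashes(chunk):
        # Similarity side: track the largest hash values seen so far.
        if len(heap) < NUM_SIM:
            heapq.heappush(heap, h)
        elif h > heap[0]:
            heapq.heapreplace(heap, h)
        # Boundary side: the same hash value decides whether a block ends here.
        if h & BOUNDARY_MASK == 0:
            boundaries.append(end)
    # Close the final block at the end of the chunk.
    if not boundaries or boundaries[-1] != len(chunk):
        boundaries.append(len(chunk))
    return sorted(heap, reverse=True), boundaries


if __name__ == "__main__":
    data = bytes(range(256)) * 8
    for idx, chunk in enumerate(partition(data, 512)):
        sims, bounds = scan_chunk(chunk)
        print(idx, sims, bounds)
```

The point of the sketch is that each rolling hash value is computed once and consulted twice inside the same loop, so neither output requires a second pass over the chunk.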
Address |
Armonk NY US |