Efficient calculation of similarity search values and digest block boundaries for data deduplication, Application No. US201313840094

Title of invention	Efficient calculation of similarity search values and digest block boundaries for data deduplication
Abstract	For efficient calculation of both similarity search values and boundaries of digest blocks in data deduplication, input data is partitioned into chunks, and for each chunk a set of rolling hash values is calculated. A single linear scan of the rolling hash values is used to produce both similarity search values and boundaries of the digest blocks of the chunk.
Publication No.	US9244937(B2)	Publication date	2016.01.26
Application No.	US201313840094	Filing date	2013.03.15
Applicant	INTERNATIONAL BUSINESS MACHINES CORPORATION	Inventors	Akirav Shay H.; Aronovich Lior; Ben-Dor Shira; Hirsch Michael; Leneman Ofer
Classification No.	G06F17/30	Main classification No.	G06F17/30
Agency	Griffiths & Seaton PLLC	Agent	Griffiths & Seaton PLLC
Main claim	1. A method for efficient calculation of both similarity search values and boundaries of digest blocks in a data deduplication system using a processor device in a computing environment, comprising: partitioning input data into data chunks; calculating a set of rolling hash values for each of the data chunks; using a single linear scan of the rolling hash values for producing both the similarity search values and the boundaries of the digest blocks; using each of the rolling hash values to contribute to the calculation of the similarity search values and to the calculation of the boundaries of the digest blocks; and discarding each of the rolling hash values after contributing to the calculation of the similarity search values and to the calculation of the boundaries of the digest blocks.
Address	Armonk NY US
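
The steps recited in the main claim can be illustrated with a minimal Python sketch. This is not the patented implementation: the record does not say how similarity search values or digest block boundaries are derived from the rolling hashes, so the sketch assumes (hypothetically) that the similarity search values are the k largest rolling hash values of a chunk and that a boundary is declared wherever the low bits of a hash are zero. The polynomial rolling hash, the constants, and the names `partition`, `rolling_hashes`, and `scan_chunk` are likewise illustrative assumptions.

```python
import heapq
from typing import Iterator, List, Tuple

# All constants below are hypothetical choices for illustration only.
WINDOW = 16                     # bytes covered by each rolling hash
BASE = 257                      # polynomial base
MOD = (1 << 61) - 1             # modulus for the polynomial hash
BOUNDARY_MASK = (1 << 6) - 1    # boundary when the low 6 bits are zero
NUM_SIMILARITY = 4              # similarity search values kept per chunk


def partition(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Split the input into fixed-size chunks (the last may be shorter)."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def rolling_hashes(chunk: bytes, window: int = WINDOW) -> Iterator[Tuple[int, int]]:
    """Lazily yield (end_offset, hash) for every `window`-byte span of the chunk."""
    if len(chunk) < window:
        return
    top = pow(BASE, window - 1, MOD)   # weight of the byte leaving the window
    h = 0
    for b in chunk[:window]:
        h = (h * BASE + b) % MOD
    yield window, h
    for i in range(window, len(chunk)):
        # Drop the oldest byte, shift, and add the newest byte.
        h = ((h - chunk[i - window] * top) * BASE + chunk[i]) % MOD
        yield i + 1, h


def scan_chunk(chunk: bytes) -> Tuple[List[int], List[int]]:
    """One linear pass: each hash feeds both calculations, then is discarded."""
    top_k: List[int] = []       # min-heap holding the k largest hashes so far
    boundaries: List[int] = []  # end offsets of digest blocks
    for end, h in rolling_hashes(chunk):
        # Contribution 1 (assumed rule): similarity search values.
        if len(top_k) < NUM_SIMILARITY:
            heapq.heappush(top_k, h)
        elif h > top_k[0]:
            heapq.heapreplace(top_k, h)
        # Contribution 2 (assumed rule): digest block boundary.
        if h & BOUNDARY_MASK == 0:
            boundaries.append(end)
        # `h` is not stored anywhere else; it is dropped on the next iteration.
    if chunk and (not boundaries or boundaries[-1] != len(chunk)):
        boundaries.append(len(chunk))  # close the final digest block
    return sorted(top_k, reverse=True), boundaries


if __name__ == "__main__":
    data = bytes((i * 31 + 7) % 256 for i in range(1000))
    for chunk in partition(data, 400):
        similarity_values, block_ends = scan_chunk(chunk)
        print(len(chunk), similarity_values[:2], block_ends[:3])
```

Because `rolling_hashes` is a generator, no array of hash values is ever materialized: memory use per chunk is bounded by the k retained similarity values plus the boundary list, which is the practical benefit of combining both calculations into one scan.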