Title: Identifying modified chunks in a data set for storage
Abstract: Provided are a computer program product, system, and method for identifying modified chunks in a data set for storage. Information is maintained on a data set of variable length chunks, including a digest of each chunk and information to locate the chunk in the data set. Modifications are received to at least one of the chunks in the data set. A determination is made of chunks including data affected by the modifications. The determined chunks including data affected by the modifications are processed to determine new chunks and, for each determined new chunk, new digest information of the new chunk. The new digest information on the at least one new chunk and information to locate the new chunk in the data set are added to the set information.
Publication number: US9110603(B2) Publication date: 2015.08.18
Application number: US201314103712 Filing date: 2013.12.11
Applicant: International Business Machines Corporation Inventors: Yakushev Mark L.; Smith Mark A.
Classification: G06F7/00; G06F3/06; G06F17/30 Main classification: G06F7/00
Agency: Konrad, Raynes, Davda and Victor LLP Agent: Victor David W.; Konrad, Raynes, Davda and Victor LLP
Principal claim: 1. A computer program product for processing modifications to a data set in storage, the computer program product comprising a non-transitory computer readable storage medium having computer readable program code embodied therein that executes to perform operations, the operations comprising: maintaining information on a data set of variable length chunks, including a digest of each chunk and information to locate the chunk in the data set; receiving modifications to at least one chunk in a range of chunks in the data set; determining the at least one chunk in the range including data affected by the modifications in the range of chunks; processing the determined at least one chunk in the range including data affected by the modifications to determine at least one new chunk in the range having a different layout in the range than the at least one determined chunk before the modifications, wherein the at least one new chunk can have more or less data than the determined at least one chunk had before the modifications; for each of the at least one determined new chunk, determining new digest information of the new chunk; and adding to the set information the new digest information on the at least one new chunk and information to locate the new chunk in the data set.
Address: Armonk NY US
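
Illustrative sketch (not part of the patent record): the operations recited in claim 1 can be pictured with a short Python example. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `ChunkInfo` record (offset and length as the "information to locate the chunk"), SHA-256 as the digest, the toy boundary rule in `split`, and the `WINDOW` and `DIVISOR` constants. The sketch re-chunks only the chunks that overlap a modification and assumes the modification starts inside the existing data.

```python
import hashlib
from dataclasses import dataclass

WINDOW = 4   # assumed: number of trailing bytes inspected for a boundary
DIVISOR = 5  # assumed: a boundary occurs when the window sum is divisible by this


@dataclass(frozen=True)
class ChunkInfo:
    offset: int   # where the chunk starts in the data set
    length: int   # how many bytes the chunk holds
    digest: str   # SHA-256 hex digest of the chunk bytes


def split(data: bytes) -> list[bytes]:
    """Toy variable-length chunking: cut after byte i when the current chunk
    has at least WINDOW bytes and the last WINDOW bytes sum to a multiple
    of DIVISOR. Any bytes left over form a final chunk."""
    chunks, start = [], 0
    for i in range(len(data)):
        long_enough = i + 1 - start >= WINDOW
        if long_enough and sum(data[i + 1 - WINDOW:i + 1]) % DIVISOR == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def build_index(data: bytes) -> list[ChunkInfo]:
    """Maintain a digest and location for every chunk of the data set."""
    index, offset = [], 0
    for chunk in split(data):
        index.append(ChunkInfo(offset, len(chunk), hashlib.sha256(chunk).hexdigest()))
        offset += len(chunk)
    return index


def apply_modification(data, index, offset, old_len, new_bytes):
    """Replace data[offset:offset + old_len] with new_bytes and return
    (new_data, new_index). Only chunks overlapping the modified range are
    re-chunked and re-digested. Assumes 0 <= offset < len(data)."""
    end = offset + max(old_len, 1)  # a pure insertion still touches one chunk
    hit = [i for i, c in enumerate(index)
           if c.offset < end and offset < c.offset + c.length]
    first, last = hit[0], hit[-1]
    region_start = index[first].offset
    region_end = index[last].offset + index[last].length

    # Bytes of the affected chunks with the modification applied.
    region = data[region_start:offset] + new_bytes + data[offset + old_len:region_end]
    new_data = data[:region_start] + region + data[region_end:]

    # New chunks may differ in number and size from the chunks they replace.
    new_chunks = [ChunkInfo(region_start + c.offset, c.length, c.digest)
                  for c in build_index(region)]

    # Later chunks keep their digests; only their locations shift.
    shift = len(new_bytes) - old_len
    tail = [ChunkInfo(c.offset + shift, c.length, c.digest)
            for c in index[last + 1:]]
    return new_data, index[:first] + new_chunks + tail
```

For example, calling `build_index(data)` once and then `apply_modification(data, index, 12, 3, b"hello world")` yields an updated index in which chunks before the modified range are carried over unchanged and chunks after it are only shifted. Re-chunking just the affected region is a simplification: a production chunker might also need to examine neighbouring chunks so that boundaries match what a full re-chunk of the data set would produce.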