Title of invention: Optimized data placement for individual file accesses on deduplication-enabled sequential storage systems
Abstract: Data deduplication for data storage tapes comprises determining the read throughput of a deduplicated set of individual files on a single data storage tape, and determining a placement of deduplicated file data on a single data storage tape to reduce an average number of per-file gaps on the tape. Deduplicated file data is placed on the single data storage tape based on said placement to increase an average read throughput for a deduplicated set of individual files.
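The "average number of per-file gaps" named in the abstract can be illustrated with a minimal sketch. It assumes a gap is any break between the tape slots that a file's segments occupy, and that each segment is stored exactly once on the tape. The function names `per_file_gaps` and `average_gaps` and the list-based tape model are hypothetical and do not come from the patent.

```python
from typing import Dict, List


def per_file_gaps(tape: List[str], file_segments: List[str]) -> int:
    """Count breaks between the tape slots occupied by one file.

    Assumption: `tape` lists segment IDs in physical order and each
    segment ID appears exactly once (no re-duplicated copies).
    """
    position = {seg: i for i, seg in enumerate(tape)}
    slots = sorted(position[seg] for seg in set(file_segments))
    # A gap exists wherever two neighbouring slots are not adjacent.
    return sum(1 for a, b in zip(slots, slots[1:]) if b != a + 1)


def average_gaps(tape: List[str], files: Dict[str, List[str]]) -> float:
    """Average the per-file gap count over all files on the tape."""
    return sum(per_file_gaps(tape, segs) for segs in files.values()) / len(files)


# Two files that share the segment "s".
files = {"A": ["a1", "s", "a2"], "B": ["b1", "s", "b2"]}

print(average_gaps(["a1", "s", "a2", "b1", "b2"], files))  # 0.5
print(average_gaps(["a1", "b1", "s", "a2", "b2"], files))  # 1.0
```

In this toy model the first layout keeps file A contiguous and leaves one gap in file B, so it has the lower average. A placement step would prefer layouts like this one.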
Publication number: US9208820(B2)
Publication date: 2015.12.08
Application number: US201213537851
Filing date: 2012.06.29
Applicant: International Business Machines Corporation
Inventors: Constantinescu Mihail C.; Gharaibeh Abdullah; Lu Maohua; Pease David A.; Sharma Anurag
Classification: G06F17/30; G11B27/032; G11B5/86
Main classification: G06F17/30
Agency: Sherman IP LLP
Agents: Sherman IP LLP; Sherman Kenneth L.; Laut Steven
Principal claim: 1. A method of data deduplication for data storage tapes, comprising: determining a read throughput of a deduplicated set of individual files on a single data storage tape; determining a placement of deduplicated file data on the single data storage tape based on sorting non-duplicate segments of different files around shared segments for reducing an average number of per-file gaps on the single data storage tape, wherein the sorting comprises ordering forward-related non-duplicate segments in a same file as following shared segments from largest segment to smallest segment, and ordering backward-related non-duplicate segments in the same file as previous shared segments from smallest to largest; and writing the deduplicated file data on the single data storage tape based on said placement to increase the read throughput for the deduplicated set of individual files and to reduce the average number of per-file gaps on the single data storage tape by re-duplicating deduplicated data for meeting optimization of individual file accesses.
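The sorting step recited in the claim can be sketched as follows. This is one possible reading of the claim language, not the patented implementation. It assumes "backward-related" segments are those that precede the shared segment in their file and "forward-related" segments are those that follow it. The `Segment` type, its `size` field, and the function `place_around_shared` are hypothetical names introduced for illustration. The sketch does not cover the re-duplication step.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    name: str
    size: int  # segment length in bytes (hypothetical field)


def place_around_shared(
    shared: Segment,
    backward: List[Segment],
    forward: List[Segment],
) -> List[Segment]:
    """Order non-duplicate segments around one shared segment.

    Backward-related segments go before the shared segment, sorted from
    smallest to largest. Forward-related segments go after it, sorted
    from largest to smallest.
    """
    before = sorted(backward, key=lambda s: s.size)
    after = sorted(forward, key=lambda s: s.size, reverse=True)
    return before + [shared] + after


layout = place_around_shared(
    shared=Segment("S", 10),
    backward=[Segment("b_big", 8), Segment("b_small", 2)],
    forward=[Segment("f_small", 1), Segment("f_big", 9)],
)
print([seg.name for seg in layout])
# ['b_small', 'b_big', 'S', 'f_big', 'f_small']
```

Under this reading, the largest non-duplicate segment on each side lands directly next to the shared segment and the smaller ones sit farther out.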
Address: Armonk NY US