Title: Scalable mechanism for detection of commonality in a deduplicated data set
Abstract: Mechanisms are provided for efficiently determining commonality in a deduplicated data set in a scalable manner regardless of the number of deduplicated files or the number of stored segments. Information is generated and maintained during deduplication to allow scalable and efficient determination of data segments shared in a particular file, other files sharing data segments included in a particular file, the number of files sharing a data segment, etc. Data need not be expanded or uncompressed. Deduplication processing can be validated and verified during commonality detection.
Publication number: US8862559(B2)  Publication date: 2014.10.14
Application number: US200912574580  Filing date: 2009.10.06
Applicant: Dell Products L.P.  Inventor: Jayaraman Vinod
Classification: G06F17/30  Main classification: G06F17/30
Agency: Kwan & Olynick LLP  Agent: Kwan & Olynick LLP
Main claim: 1. A method, comprising: generating a filemap corresponding to a deduplicated file using a processor included in a deduplication system, the filemap including a plurality of filemap indices, a plurality of offsets to identify a plurality of data segments in the deduplicated file, and a plurality of entries identifying last files having placed a reference to corresponding data segments in the deduplicated file; modifying a datastore suitcase, the datastore suitcase including an index portion and a data portion, the data portion holding a plurality of datastore indices corresponding to the filemap indices, a plurality of deduplicated data segments, and a last file entry identifying last files having placed a reference to deduplicated data segments, wherein the datastore suitcase is created when the processor processes a file for deduplication.
Address: Round Rock, TX, US
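
Illustrative sketch (not part of the patent record): the filemap and datastore suitcase named in the main claim can be modeled roughly as below. This is one possible reading of the claim, not the patented implementation. All names here (FilemapEntry, SuitcaseEntry, Suitcase, dedup_file, files_sharing) are hypothetical. The per-segment ref_count and the lookup of segments by raw content are simplifying assumptions; the record does not say how segments are matched or counted. The sketch also assumes each file name is deduplicated only once.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FilemapEntry:
    offset: int               # where the segment starts within the file
    index: int                # filemap index, equal to the datastore index
    prev_file: Optional[str]  # last file that referenced the segment before this file


@dataclass
class SuitcaseEntry:
    data: bytes               # the deduplicated segment itself
    last_file: str            # most recent file to place a reference
    ref_count: int = 1        # assumption: number of files sharing the segment


@dataclass
class Suitcase:
    entries: Dict[int, SuitcaseEntry] = field(default_factory=dict)
    by_content: Dict[bytes, int] = field(default_factory=dict)  # assumption: raw-content lookup


def dedup_file(name: str, segments: List[bytes], suitcase: Suitcase,
               filemaps: Dict[str, List[FilemapEntry]]) -> None:
    """Build the filemap for one file and update the suitcase's last-file entries."""
    entries: List[FilemapEntry] = []
    seen: Dict[int, Optional[str]] = {}  # segments already referenced by this file
    offset = 0
    for seg in segments:
        idx = suitcase.by_content.get(seg)
        if idx is None:
            # New segment: store it and record this file as the last referrer.
            idx = len(suitcase.entries) + 1
            suitcase.by_content[seg] = idx
            suitcase.entries[idx] = SuitcaseEntry(seg, last_file=name)
            seen[idx] = None
        elif idx not in seen:
            # Segment shared with an earlier file: chain back to that file.
            entry = suitcase.entries[idx]
            seen[idx] = entry.last_file
            entry.last_file = name
            entry.ref_count += 1
        entries.append(FilemapEntry(offset, idx, seen[idx]))
        offset += len(seg)
    filemaps[name] = entries


def files_sharing(idx: int, suitcase: Suitcase,
                  filemaps: Dict[str, List[FilemapEntry]]) -> List[str]:
    """Walk the last-file chain to list every file referencing segment idx."""
    result: List[str] = []
    current: Optional[str] = suitcase.entries[idx].last_file
    while current is not None:
        result.append(current)
        entry = next(e for e in filemaps[current] if e.index == idx)
        current = entry.prev_file
    return result


suitcase, filemaps = Suitcase(), {}
dedup_file("X", [b"A", b"B", b"C"], suitcase, filemaps)
dedup_file("Y", [b"A", b"D"], suitcase, filemaps)
dedup_file("Z", [b"A", b"B", b"A"], suitcase, filemaps)
print(files_sharing(1, suitcase, filemaps))  # files sharing segment b"A", newest first
```

In this sketch the chain walk touches only the files that actually reference the segment, and it never reads or expands segment data. That is consistent with the abstract's statement that commonality can be determined regardless of the total number of files or stored segments.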