Title of invention: Method and system for efficiently handling small files in a single instance storage data store
Abstract: A method, system, and apparatus are provided for efficient storage of small files in a segment-based deduplication scheme by allocating multiple small files to a single data segment. Also provided is a mechanism for distinguishing between large files (e.g., files that are on the order of the size of a segment or larger) and smaller files, and for starting a new segment at the beginning of a large file. Further provided is a file attribute-based system for determining the identity of a small file at which to begin a new segment, and then allocating subsequent small files to that segment and contiguous segments until the next small file having an appropriate attribute is encountered, at which point a new segment begins. In one aspect of the present invention, a filename hash is used for file attribute analysis to determine when a new segment should begin. Using such a mechanism, multiple small files can be allocated to a data segment while large files continue to be stored efficiently within separate data segments. The file attribute analysis further increases the deduplication rate for subsequently provided copies of the small files (e.g., in a backup), since segment boundaries remain constant in spite of file additions or deletions.
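The boundary rule described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the segment size, the SHA-1 hash, the modulus test, and the names `SEGMENT_SIZE`, `BOUNDARY_MODULUS`, `starts_segment`, and `group_files` are all assumptions chosen for illustration. The abstract states only that a filename hash may decide where a new segment begins. The sketch also assumes that the first small file after a large file opens a fresh group.

```python
import hashlib
from typing import Iterable, List, Tuple

SEGMENT_SIZE = 128 * 1024   # assumed segment size in bytes (hypothetical)
BOUNDARY_MODULUS = 8        # assumed: about 1 in 8 small files opens a new segment


def starts_segment(name: str) -> bool:
    """Decide from the filename alone whether this small file opens a new segment."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % BOUNDARY_MODULUS == 0


def group_files(files: Iterable[Tuple[str, int]]) -> List[List[str]]:
    """Group (name, size) pairs into runs that share a segment or contiguous segments."""
    groups: List[dict] = []
    for name, size in files:
        large = size >= SEGMENT_SIZE
        prev_large = bool(groups) and groups[-1]["large"]
        # A new run begins at a large file, right after a large file,
        # at the very first file, or at a small file whose name hash matches.
        if large or not groups or prev_large or starts_segment(name):
            groups.append({"large": large, "names": [name]})
        else:
            groups[-1]["names"].append(name)
    return [g["names"] for g in groups]


if __name__ == "__main__":
    listing = [(f"docs/note{i}.txt", 2048) for i in range(20)]
    listing.insert(5, ("images/disk.iso", 4 * SEGMENT_SIZE))
    for run in group_files(listing):
        print(run)
```

Because the boundary decision depends only on each file's own name, adding or deleting a non-boundary file changes the contents of one run and leaves every other run, and therefore its deduplication fingerprint, unchanged.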
Publication number: US8572055(B1)  Publication date: 2013.10.29
Application number: US20080164284  Filing date: 2008.06.30
Applicant: SYMANTEC OPERATING CORPORATION;WU WEIBAO;ZEIS MICHAEL JOHN  Inventor: WU WEIBAO;ZEIS MICHAEL JOHN
Classification: G06F7/00;G06F17/00  Main classification: G06F7/00
Agency:  Agent:
Principal claim:
Address: