摘要 |
Deduplication of data on disk devices based on a threshold number (THN) of sequential blocks is described herein, the threshold number being two or greater. Deduplication may be performed when a series of THN or more received blocks (THN series) match a sequence of THN or more stored blocks (THN sequence), whereby a sequence comprises blocks stored on the same track of a disk device. Deduplication may be performed using a block-comparison mechanism comprising metadata entries of stored blocks and a mapping mechanism containing mappings of deduplicated blocks to their matching blocks. The mapping mechanism may be used to perform later read requests received for the deduplicated blocks. The deduplication described herein may reduce the read latency as the number of seeks between tracks may be reduced. Also, when a seek to a different track is performed, the seek time cost is spread over THN or more blocks.
|