Abstract |
Data deduplication is carried out in a storage system (10) in which a set of volumes of data is distributed among a plurality of servers (22). The technique comprises computing a similarity metric among volumes of the set and making a determination that a difference in the similarity metric is less than a predetermined threshold value. Responsive to the determination, the data of the volumes of the set is migrated within their respective servers so that the migrated data is distributed in like manner in the respective servers. Thereafter, data deduplication is performed on the respective servers. |
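
The sequence described above (compute a similarity metric, compare against a threshold, migrate, then deduplicate per server) can be illustrated with a minimal sketch. Nothing below is taken from the abstract beyond that sequence: the fixed-size chunking, the use of Jaccard distance as the "difference in the similarity metric", the hash-modulo placement rule used to distribute data "in like manner", and all function names are assumptions chosen for illustration only.

```python
import hashlib
from itertools import combinations


def chunk_hashes(data: bytes, chunk_size: int = 4) -> list[str]:
    """Split a volume into fixed-size chunks and hash each one (assumed chunking scheme)."""
    return [hashlib.sha256(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """Assumed difference measure: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def similar_enough(volumes: dict[str, list[str]], threshold: float) -> bool:
    """True if every pair of volumes differs by less than the threshold."""
    return all(jaccard_distance(set(volumes[x]), set(volumes[y])) < threshold
               for x, y in combinations(volumes, 2))


def migrate(volumes: dict[str, list[str]], num_servers: int) -> list[list[tuple[str, str]]]:
    """Assumed placement rule: a chunk goes to the server selected by its hash,
    so identical chunks from different volumes end up on the same server."""
    servers: list[list[tuple[str, str]]] = [[] for _ in range(num_servers)]
    for name, hashes in volumes.items():
        for h in hashes:
            servers[int(h, 16) % num_servers].append((name, h))
    return servers


def deduplicate(server: list[tuple[str, str]]) -> set[str]:
    """Keep one copy of each distinct chunk hash held by a single server."""
    return {h for _, h in server}


if __name__ == "__main__":
    # Two hypothetical volumes that share three of their four chunks.
    volumes = {
        "v1": chunk_hashes(b"AAAABBBBCCCCDDDD"),
        "v2": chunk_hashes(b"AAAABBBBCCCCEEEE"),
    }
    if similar_enough(volumes, threshold=0.5):
        servers = migrate(volumes, num_servers=2)
        stored = sum(len(deduplicate(s)) for s in servers)
        print(f"chunks before: {sum(len(s) for s in servers)}, after: {stored}")
```

Because placement depends only on the chunk hash in this sketch, duplicates shared by similar volumes are always co-located, which is what allows each server to deduplicate locally without consulting the others.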