Title: Secure distributed deduplication in encrypted data storage
Abstract: In an encrypted storage system employing data deduplication, encrypted data units are stored with their respective keyed data digests. A secure equivalence process is performed to determine whether an encrypted data unit on one storage unit is a duplicate of an encrypted data unit on another storage unit. The process includes an exchange phase and a testing phase in which no sensitive information is exposed outside the storage units. If duplication is detected, the duplicate data unit is deleted from one of the storage units and replaced with a mapping to the encrypted data unit as stored on the other storage unit. The mapping is used at the one storage unit when the corresponding logical data unit is accessed there.
Publication number: US8930687(B1)    Publication date: 2015.01.06
Application number: US201313833517    Filing date: 2013.03.15
Applicant: EMC Corporation    Inventors: Robinson Peter Alan; Young Eric
Classification: G06F12/14    Main classification: G06F12/14
Agency: BainwoodHuang    Agent: BainwoodHuang
Main claim: 1. A method of providing data deduplication across first and second storage devices in an encrypted storage system, comprising:
storing respective first and second data units along with respective first and second keyed data digests of the first and second data units at the first and second storage devices, the first and second data units encrypted under respective distinct data encryption keys, the first and second keyed data digests calculated from the respective first and second data units and a data digest key;
engaging in a secure equivalence detection process between the first and second storage devices to determine whether the first data unit stored at the first storage device is a duplicate of the second data unit stored at the second storage device, the process employing two distinct asymmetric key pairs having respective first and second public keys, both key pairs being members of one mathematical prime group having a modulus and a generator, the process including:
an exchange phase including (1) at each of the first and second storage devices, calculating respective first and second products from the respective first and second keyed data digests and the respective first and second public keys and providing the respective first and second products to the second and first storage devices respectively, (2) at the first storage device, calculating a first quotient and a first hash and providing the first hash to the second storage device, the first quotient calculated from the first keyed data digest and first public key and the second product, the first hash calculated as a message digest of the first quotient combined with the first and second products, and (3) at the second storage device, calculating a second quotient and second hash and providing the second hash to the first storage device, the second quotient calculated from the second keyed data digest and second public key and the first product, the second hash calculated as a message digest of the second quotient combined with the first hash; and
a testing phase including one or both of (1) at the second storage device, calculating a first candidate hash and comparing it against the first hash from the first storage device, the first candidate hash calculated as a message digest of the second quotient combined with the first and second products, the comparing generating a second-unit indication whether the second data unit is a duplicate of the first data unit, and (2) at the first storage device, calculating a second candidate hash and comparing it against the second hash from the second storage device, the second candidate hash calculated as a message digest of the first quotient combined with the second hash, the comparing generating a first-unit indication whether the first data unit is a duplicate of the second data unit; and
based upon the first-unit indication and/or the second-unit indication, deleting the data unit at the respective first and second storage devices and creating a respective mapping between an identifier of the respective first or second data unit at the respective first or second storage device and the respective second or first data unit stored in the respective second or first storage device,
wherein the first and second storage devices have different data access characteristics to collectively provide storage over different phases of a data lifecycle, and
wherein deduplication is performed as part of migrating the respective first or second data unit from the respective first or second storage devices to the respective second or first storage device.
Address: Hopkinton MA US
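
The exchange and testing phases recited in the main claim can be illustrated with a minimal Python sketch. It is one possible reading of the claim language, not the patented implementation, and it rests on several assumptions that the record does not state: the keyed data digest is taken to be HMAC-SHA256 mapped into the group; the "product" is the digest multiplied by the public key modulo the prime; the "quotient" divides the peer's product by the local digest and raises the result to the local private exponent (a Diffie-Hellman-style step; the claim itself names only the keyed digest, the public key, and the peer's product); and the second candidate hash is computed over the first quotient and the first hash so that it can be compared with the second hash. The modulus `P`, generator `G`, and all function and class names are hypothetical, and the toy group is far too small for real use.

```python
import hashlib
import hmac
import secrets

# Toy group parameters (assumption): a 127-bit Mersenne prime and a small base.
# A real system would use a vetted, much larger prime group.
P = 2**127 - 1
G = 3


def keyed_digest(digest_key: bytes, data: bytes) -> int:
    """Keyed data digest mapped to a nonzero group element (assumed HMAC-SHA256)."""
    mac = hmac.new(digest_key, data, hashlib.sha256).digest()
    return 1 + int.from_bytes(mac, "big") % (P - 1)  # value in [1, P-1]


def to_bytes(n: int) -> bytes:
    """Fixed-width encoding of a group element (values below 2**127 fit in 16 bytes)."""
    return n.to_bytes(16, "big")


class Device:
    """One storage device holding the keyed digest of a single data unit."""

    def __init__(self, digest: int):
        self.digest = digest
        self.private = secrets.randbelow(P - 2) + 1   # private exponent
        self.public = pow(G, self.private, P)         # public key in the shared group

    def product(self) -> int:
        # Exchange step (1): mask the digest with the public key.
        return self.digest * self.public % P

    def quotient(self, peer_product: int) -> int:
        # Exchange steps (2)/(3): divide out the local digest, then exponentiate.
        # If both digests match, both sides arrive at G**(x1*x2) mod P.
        unmasked = peer_product * pow(self.digest, -1, P) % P
        return pow(unmasked, self.private, P)


def hash_with_products(q: int, p1: int, p2: int) -> bytes:
    return hashlib.sha256(to_bytes(q) + to_bytes(p1) + to_bytes(p2)).digest()


def hash_with_hash(q: int, h: bytes) -> bytes:
    return hashlib.sha256(to_bytes(q) + h).digest()


def check_duplicate(digest1: int, digest2: int):
    """Run both phases; return (first_unit_indication, second_unit_indication)."""
    dev1, dev2 = Device(digest1), Device(digest2)

    # Exchange phase: only products and hashes cross between devices.
    p1, p2 = dev1.product(), dev2.product()
    q1 = dev1.quotient(p2)
    h1 = hash_with_products(q1, p1, p2)   # first hash, sent to device 2
    q2 = dev2.quotient(p1)
    h2 = hash_with_hash(q2, h1)           # second hash, sent to device 1

    # Testing phase: each side recomputes the peer's hash from its own quotient.
    second_unit = hmac.compare_digest(hash_with_products(q2, p1, p2), h1)
    first_unit = hmac.compare_digest(hash_with_hash(q1, h1), h2)
    return first_unit, second_unit


if __name__ == "__main__":
    key = b"shared data digest key"
    same = check_duplicate(keyed_digest(key, b"block A"), keyed_digest(key, b"block A"))
    diff = check_duplicate(keyed_digest(key, b"block A"), keyed_digest(key, b"block B"))
    print(same, diff)
```

In this sketch the digests, private exponents, and quotients never leave their device; a device that reports a match could then delete its copy and record a mapping to the peer's copy, as the claim's final step describes.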