Title of Invention |
Storage-network de-duplication |
Abstract |
Techniques are provided for de-duplication of data. In one embodiment, a system comprises de-duplication logic that is coupled to a de-duplicated repository. The de-duplication logic is operable to receive, from a client device over a network, a request to store a file in the de-duplicated repository using a single storage encoding. The request includes a file identifier and a set of signatures that identify a set of chunks from the file. The de-duplication logic determines whether any chunks in the set are missing from the de-duplicated repository and requests the missing chunks from the client device. Then, for each missing chunk, the de-duplication logic stores in the de-duplicated repository that chunk and a signature representing that chunk. The de-duplication logic also stores, in the de-duplicated repository, a file entry that represents the file and that associates the set of signatures with the file identifier. |
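The store-request flow described in the abstract can be sketched roughly as follows. This is a minimal in-memory illustration, not the patented implementation: the class `DedupRepository`, the helper `signature`, the `fetch_chunk` callback (standing in for the request back to the client device), the use of SHA-256 as the signature function, and the signature check on received chunks are all assumptions made for the example.

```python
import hashlib


def signature(chunk: bytes) -> str:
    """Hypothetical signature function; SHA-256 is an assumption, not from the source."""
    return hashlib.sha256(chunk).hexdigest()


class DedupRepository:
    """Toy in-memory stand-in for the de-duplicated repository (hypothetical)."""

    def __init__(self):
        self.chunks = {}  # signature -> chunk bytes, each chunk stored once
        self.files = {}   # file identifier -> ordered list of signatures

    def missing(self, signatures):
        """Return signatures whose chunks are not stored, in request order, no repeats."""
        seen = set()
        result = []
        for sig in signatures:
            if sig not in self.chunks and sig not in seen:
                seen.add(sig)
                result.append(sig)
        return result

    def store_file(self, file_id, signatures, fetch_chunk):
        """Handle a store request.

        fetch_chunk(sig) models asking the client device for one missing chunk.
        Returns the list of signatures that had to be requested.
        """
        requested = self.missing(signatures)
        for sig in requested:
            chunk = fetch_chunk(sig)
            # Assumed safety check: the received chunk must match its signature.
            if signature(chunk) != sig:
                raise ValueError("chunk does not match its signature")
            self.chunks[sig] = chunk
        # The file entry associates the file identifier with its signatures.
        self.files[file_id] = list(signatures)
        return requested

    def read_file(self, file_id):
        """Rebuild a file from its entry by concatenating the referenced chunks."""
        return b"".join(self.chunks[sig] for sig in self.files[file_id])


# Usage: two files share the chunk b"beta", so it is transferred only once.
repo = DedupRepository()
client_chunks = {signature(c): c for c in (b"alpha", b"beta", b"gamma")}

sigs_a = [signature(b"alpha"), signature(b"beta"), signature(b"alpha")]
first = repo.store_file("a.txt", sigs_a, client_chunks.__getitem__)

sigs_b = [signature(b"beta"), signature(b"gamma")]
second = repo.store_file("b.txt", sigs_b, client_chunks.__getitem__)
```

In this sketch the second request transfers only the `b"gamma"` chunk, because `b"beta"` is already present from the first file; the file entry alone is enough to reconstruct either file.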
Publication Number |
US9558201(B2) |
Publication Date |
2017.01.31 |
Application Number |
US201414149762 |
Filing Date |
2014.01.07 |
Applicant |
VMware, Inc. |
Inventors |
Ben-Shaul Israel Zvi;Vasetsky Leonid |
Classification Numbers |
G06F17/30;G06F3/06;G06F11/14 |
Main Classification Number |
G06F17/30 |
Patent Agency |
|
Patent Agent |
|
Main Claim |
1. A method comprising:
receiving a request to store a file in a de-duplicated repository of a management system, wherein the de-duplicated repository is stored on physical disk blocks that have a fixed size, and wherein the request includes an identifier of the file and a set of signatures that respectively identify a set of chunks from the file;
identifying a first chunk, from the set of chunks, that is not stored in the de-duplicated repository;
storing the first chunk and a first signature, from the set of signatures, that represents the first chunk in the de-duplicated repository, wherein the first chunk is generated as a variable-sized chunk based on the fixed size of the physical disk blocks such that each variable-sized chunk is no larger than the fixed size of the physical disk blocks; and
storing a second chunk and a second signature, from the set of signatures, that represents the second chunk in the de-duplicated repository, wherein the second chunk is generated such that a combined size of the first chunk and the second chunk is no larger than the fixed size of the physical disk blocks. |
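The size constraints in the claim (each variable-sized chunk is no larger than a fixed-size disk block, and a first and second chunk together fit in one block) can be illustrated with a small sketch. The function `pack_chunks` and its `preferred_lengths` parameter are hypothetical: the preferred lengths stand in for whatever boundaries a variable-size chunker would propose, and truncating a chunk to the room left in the current block is only one assumed way to satisfy the constraint, not necessarily the method the patent uses.

```python
def pack_chunks(data: bytes, block_size: int, preferred_lengths):
    """Cut data into variable-sized chunks, grouped by the disk block that holds them.

    Each chunk is capped at the room left in the current block, so no chunk
    exceeds block_size and the chunks in one block never exceed it combined.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    room = 0  # bytes still free in the current block
    offset = 0
    lengths = iter(preferred_lengths)
    while offset < len(data):
        if room == 0:
            # Current block is full (or none exists yet): start a new one.
            blocks.append([])
            room = block_size
        # Fall back to a full block once the proposed lengths run out.
        want = next(lengths, block_size)
        if want <= 0:
            raise ValueError("preferred lengths must be positive")
        take = min(want, room, len(data) - offset)
        blocks[-1].append(data[offset:offset + take])
        offset += take
        room -= take
    return blocks


# Usage: 20 bytes, 8-byte blocks, a chunker that would prefer 5-byte chunks.
data = bytes(range(20))
blocks = pack_chunks(data, block_size=8, preferred_lengths=[5, 5, 5, 5])
sizes = [[len(chunk) for chunk in block] for block in blocks]
# sizes is [[5, 3], [5, 3], [4]]
```

In each block of the example, the first chunk takes its preferred 5 bytes and the second is cut to the remaining 3, so their combined size equals the 8-byte block size and no chunk straddles a block boundary.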
Address |
Palo Alto CA US |