Title of invention Storage-network de-duplication
Abstract Techniques are provided for de-duplication of data. In one embodiment, a system comprises de-duplication logic that is coupled to a de-duplication repository. The de-duplication logic is operable to receive, from a client device over a network, a request to store a file in the de-duplicated repository using a single storage encoding. The request includes a file identifier and a set of signatures that identify a set of chunks from the file. The de-duplication logic determines whether any chunks in the set are missing from the de-duplicated repository and requests the missing chunks from the client device. Then, for each missing chunk, the de-duplication logic stores in the de-duplicated repository that chunk and a signature representing that chunk. The de-duplication logic also stores, in the de-duplicated repository, a file entry that represents the file and that associates the set of signatures with the file identifier.
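The store request described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class name `DedupRepository`, the helper `sign`, the `fetch_chunk` callback (standing in for the network round trip to the client), and the choice of SHA-256 as the signature function are all assumptions made for this example.

```python
import hashlib


def sign(chunk: bytes) -> str:
    # Assumption: a SHA-256 hex digest stands in for the chunk signature.
    return hashlib.sha256(chunk).hexdigest()


class DedupRepository:
    """Hypothetical in-memory stand-in for the de-duplicated repository."""

    def __init__(self):
        self.chunks = {}  # signature -> chunk bytes, each stored once
        self.files = {}   # file identifier -> ordered list of signatures

    def missing_signatures(self, signatures):
        # Report each signature the repository lacks, once, in request order.
        seen = set()
        missing = []
        for sig in signatures:
            if sig not in self.chunks and sig not in seen:
                seen.add(sig)
                missing.append(sig)
        return missing

    def store_file(self, file_id, signatures, fetch_chunk):
        # Request only the missing chunks from the client, then store each
        # chunk together with its signature.
        for sig in self.missing_signatures(signatures):
            chunk = fetch_chunk(sig)
            if sign(chunk) != sig:
                raise ValueError("chunk does not match its signature")
            self.chunks[sig] = chunk
        # The file entry associates the set of signatures with the identifier.
        self.files[file_id] = list(signatures)

    def read_file(self, file_id):
        # Reassemble a file from its file entry.
        return b"".join(self.chunks[sig] for sig in self.files[file_id])
```

In this sketch a second file that shares chunks with an earlier one only triggers transfers for the chunks the repository has not yet seen, which is the bandwidth and storage saving the abstract describes.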
Publication number US9558201(B2) Publication date 2017.01.31
Application number US201414149762 Filing date 2014.01.07
Applicant VMware, Inc. Inventor Ben-Shaul Israel Zvi;Vasetsky Leonid
Classification G06F17/30;G06F3/06;G06F11/14 Main classification G06F17/30
Agency Agent
Principal claim 1. A method comprising: receiving a request to store a file in a de-duplicated repository of a management system, wherein the de-duplicated repository is stored on physical disk blocks that have a fixed size, and wherein the request includes an identifier of the file and a set of signatures that respectively identify a set of chunks from the file; identifying a first chunk, from the set of chunks, that is not stored in the de-duplicated repository; storing the first chunk and a first signature, from the set of signatures, that represents the first chunk in the de-duplicated repository, wherein the first chunk is generated as a variable-sized chunk based on the fixed size of the physical disk blocks such that each variable-sized chunk is no larger than the fixed size of the physical disk blocks; and storing a second chunk and a second signature, from the set of signatures, that represents the second chunk in the de-duplicated repository, wherein the second chunk is generated such that a combined size of the first chunk and the second chunk is no larger than the fixed size of the physical disk blocks.
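The size constraints in the claim (each variable-sized chunk no larger than a physical disk block, and a first and second chunk whose combined size also fits in one block) can be illustrated with a simple placement check. The claim does not say how chunks are generated or laid out on disk, so the function `pack_chunks`, its next-fit strategy, and the 4096-byte block size used in the usage note are assumptions chosen only for illustration.

```python
def pack_chunks(chunks, block_size):
    """Group chunks, in order, into fixed-size blocks (next-fit).

    Hypothetical helper: returns a list of blocks, each a list of chunks
    whose combined size is no larger than block_size.
    """
    blocks = []
    current = []
    used = 0
    for chunk in chunks:
        # Each variable-sized chunk must itself fit in one block.
        if len(chunk) > block_size:
            raise ValueError("chunk is larger than the physical block size")
        # Start a new block when adding this chunk would overflow the current one.
        if used + len(chunk) > block_size:
            blocks.append(current)
            current = []
            used = 0
        current.append(chunk)
        used += len(chunk)
    if current:
        blocks.append(current)
    return blocks
```

For example, with an assumed block size of 4096 bytes, chunks of 1500 and 2000 bytes share one block because their combined size is 3500 bytes, while a following 3000-byte chunk starts a new block.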
Address Palo Alto CA US