Title of Invention |
HASH BASED DE-DUPLICATION IN A STORAGE SYSTEM |
Abstract |
A method for de-duplication that may include receiving a request to store in a storage system a received data entity; obtaining a received data entity signature that is responsive to the received data entity; selecting a selected data structure out of a set of data structures that comprises K data structures; wherein K is a positive integer; wherein for each value of a variable k that ranges between 2 and K, a stored data entity signature that is stored in a k'th data structure out of the set collided with stored data entity signatures that are stored in each one of the first through (k−1)'th data structures of the set; calculating an index by applying, on the received data entity signature, a hash function that is associated with the selected data structure; determining whether an entry that is associated with the index and belongs to the selected data structure is empty; writing to the entry, if the entry is empty, the received data entity signature, and storing the received data entity in the storage system in response to a location of the entry in the set; selecting, if (a) the entry is not empty and (b) the received data entity signature differs from a stored data entity signature that is stored in the entry, a new data structure of the set, and repeating at least the stages of calculating and determining. |
Publication Number |
US2015370835(A1) |
Publication Date |
2015.12.24 |
Application Number |
US201414312724 |
Filing Date |
2014.06.24 |
Applicant |
Infinidat LTD. |
Inventor |
Yochai Yechiel |
Classification |
G06F17/30 |
Main Classification |
G06F17/30 |
Agency |
|
Agent |
|
Principal Claim |
1. A method for de-duplication, the method comprises:
receiving a request to store in a storage system a received data entity;
obtaining a received data entity signature that is responsive to the received data entity;
selecting a selected data structure out of a set of data structures that comprises K data structures; wherein K is a positive integer; wherein for each value of a variable k that ranges between 2 and K, a stored data entity signature that is stored in a k'th data structure out of the set collided with stored data entity signatures that are stored in each one of a first till (k−1)'th data structures of the set;
calculating an index by applying, on the received data entity signature, a hash function that is associated with the selected data structure;
determining whether an entry that is associated with the index and belongs to the selected data structure is empty;
writing to the entry, if the entry is empty, the received data entity signature, and storing the received data entity in the storage system in response to a location of the entry in the set;
selecting, if (a) the entry is not empty and (b) the received data entity signature differs from a stored data entity signature that is stored in the entry, a new data structure of the set, and repeating at least the stages of calculating and determining. |
Address |
Herzliya IL |
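
The method described in the abstract and claim 1 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The class name `CascadedDedupStore`, the use of SHA-256 as the data entity signature, the salted per-table hash function, the fixed table size, and the in-memory `blocks` dictionary standing in for the storage system are all assumptions made for illustration. The record does not say what happens when the stored signature equals the received one, or when all K data structures collide; the sketch assumes the first case is treated as a duplicate and the second raises an error.

```python
import hashlib


class CascadedDedupStore:
    """Hypothetical sketch of a set of K hash tables probed in order."""

    def __init__(self, num_tables=3, slots_per_table=8):
        self.num_tables = num_tables      # K in the claim
        self.slots = slots_per_table      # assumed fixed size per table
        # Each table holds signatures; None marks an empty entry.
        self.tables = [[None] * slots_per_table for _ in range(num_tables)]
        # Stand-in for the storage system: the entry location (k, index)
        # determines where the data entity is kept.
        self.blocks = {}

    @staticmethod
    def signature(data: bytes) -> bytes:
        # Assumption: the signature is a SHA-256 digest of the data entity.
        return hashlib.sha256(data).digest()

    def _index(self, level: int, sig: bytes) -> int:
        # Assumption: each table's hash function is SHA-256 salted with
        # the table number, reduced modulo the table size.
        salted = hashlib.sha256(level.to_bytes(4, "big") + sig).digest()
        return int.from_bytes(salted[:8], "big") % self.slots

    def store(self, data: bytes):
        """Return ((table, index), was_duplicate)."""
        sig = self.signature(data)
        for level in range(self.num_tables):
            idx = self._index(level, sig)
            entry = self.tables[level][idx]
            if entry is None:
                # Empty entry: record the signature and store the data
                # at a location derived from the entry's position.
                self.tables[level][idx] = sig
                self.blocks[(level, idx)] = data
                return (level, idx), False
            if entry == sig:
                # Assumed behaviour: same signature means duplicate data,
                # so nothing new is written.
                return (level, idx), True
            # Different signature at this index: collision, so move on
            # to the next table and repeat the calculation.
        raise RuntimeError("signature collided in all K tables")
```

Because a signature reaches table k only after colliding in tables 1 through k−1, the sketch preserves the property stated in the claim for every stored signature. Calling `store` twice with the same bytes returns the same location, with the duplicate flag set on the second call.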