Title of invention: HASH BASED DE-DUPLICATION IN A STORAGE SYSTEM
Abstract: A method for de-duplication. The method may include: receiving a request to store a received data entity in a storage system; obtaining a received data entity signature that is responsive to the received data entity; selecting a selected data structure out of a set of K data structures, where K is a positive integer and where, for each value of a variable k that ranges between 2 and K, a stored data entity signature that is stored in the k'th data structure of the set collided with stored data entity signatures that are stored in each one of the first through (k−1)'th data structures of the set; calculating an index by applying, on the received data entity signature, a hash function that is associated with the selected data structure; determining whether an entry that is associated with the index and belongs to the selected data structure is empty; if the entry is empty, writing the received data entity signature to the entry and storing the received data entity in the storage system in response to a location of the entry in the set; and, if (a) the entry is not empty and (b) the received data entity signature differs from the stored data entity signature that is stored in the entry, selecting a new data structure of the set and repeating at least the stages of calculating and determining.
Publication number: US2015370835(A1)  Publication date: 2015.12.24
Application number: US201414312724  Filing date: 2014.06.24
Applicant: Infinidat LTD.  Inventor: Yochai Yechiel
Classification: G06F17/30  Main classification: G06F17/30
Agency:  Agent:
Main claim: 1. A method for de-duplication, the method comprises: receiving a request to store in a storage system a received data entity; obtaining a received data entity signature that is responsive to the received data entity; selecting a selected data structure out of a set of data structures that comprises K data structures; wherein K is a positive integer; wherein for each value of a variable k that ranges between 2 and K, a stored data entity signature that is stored in a k'th data structure out of the set collided with stored data entity signatures that are stored in each one of a first till (k−1)'th data structures of the set; calculating an index by applying, on the received data entity signature, a hash function that is associated with the selected data structure; determining whether an entry that is associated with the index and belongs to the selected data structure is empty; writing to the entry, if the entry is empty, the received data entity signature, and storing the received data entity in the storage system in response to a location of the entry in the set; selecting, if (a) the entry is not empty and (b) the received data entity signature differs from a stored data entity signature that is stored in the entry, a new data structure of the set, and repeating at least the stages of calculating and determining.
Address: Herzliya IL
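
Illustrative sketch (not part of the patent record): the store path described in the abstract and claim 1 can be pictured roughly as below. This is only one reading of the text, under several assumptions that the record does not state: the class name `CascadedDedupIndex` and its methods are hypothetical, the signature is taken to be SHA-256, each table's hash function is modeled as salted BLAKE2b, the initially selected data structure is assumed to be the first one, and the storage location is modeled as the pair (table number, index).

```python
import hashlib


class CascadedDedupIndex:
    """Hypothetical model of K signature tables, each with its own hash function."""

    def __init__(self, num_tables=4, slots_per_table=1024):
        self.num_tables = num_tables          # K in the claim
        self.slots_per_table = slots_per_table
        # tables[k][i] holds a stored signature, or None if the entry is empty.
        self.tables = [[None] * slots_per_table for _ in range(num_tables)]
        # Assumption: the entry location (k, i) decides where the data is kept.
        self.storage = {}

    @staticmethod
    def signature(data: bytes) -> bytes:
        # Assumption: the data entity signature is a SHA-256 digest.
        return hashlib.sha256(data).digest()

    def _index(self, k: int, sig: bytes) -> int:
        # Assumption: the hash function of table k is BLAKE2b salted with k.
        salt = k.to_bytes(8, "little")
        digest = hashlib.blake2b(sig, digest_size=8, salt=salt).digest()
        return int.from_bytes(digest, "little") % self.slots_per_table

    def put(self, data: bytes):
        """Return ((k, index), is_new). is_new is False for a duplicate."""
        sig = self.signature(data)
        for k in range(self.num_tables):      # assumed to start at the first table
            i = self._index(k, sig)
            entry = self.tables[k][i]
            if entry is None:
                # Empty entry: record the signature and store the data there.
                self.tables[k][i] = sig
                self.storage[(k, i)] = data
                return (k, i), True
            if entry == sig:
                # Same signature already stored: treat as a duplicate.
                return (k, i), False
            # Different signature in the entry: collision, try the next table.
        raise RuntimeError("signature collided in all K data structures")
```

Because each insertion probes the tables in order, a signature can only land in table k after colliding in every earlier table, which matches the property the claim states for k between 2 and K. What happens when all K tables collide is not covered by the record; the sketch simply raises an error.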