Abstract |
Data segments are logically organized in groups in a data repository. Each segment is stored at an index in the data repository. In association with a write request, a hash algorithm is applied to the data segment to generate a group identifier. Each group is identifiable by a corresponding group identifier. The group identifier is applied to a hash tree to determine whether a corresponding group in the data repository exists. Each existing group in the data repository corresponds to a leaf of the hash tree. If no corresponding group exists in the data repository, the data segment is stored in a new group in the data repository. However, if a corresponding group exists, the group is further searched to determine if a data segment matching the data segment to be stored is already stored. The data segment can be stored in accordance with the results of the search. |
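The write path described in the abstract can be illustrated with a minimal in-memory sketch. It is not the claimed implementation; several details are assumptions made only for illustration: SHA-256 truncated to a few bits stands in for the unspecified hash algorithm, the hash tree is modeled as a binary trie over the bits of the group identifier with one leaf per existing group, and a matching segment is treated as a duplicate that is not stored again. The names `Repository`, `group_id`, and `write` are hypothetical.

```python
import hashlib


class Repository:
    """Toy data repository: segments live at integer indices (assumed layout)."""

    def __init__(self, id_bits=8):
        if not 1 <= id_bits <= 32:
            raise ValueError("id_bits must be between 1 and 32")
        self.id_bits = id_bits
        self.segments = []  # index -> segment bytes
        self.tree = {}      # hash tree: nested dicts keyed by "0"/"1"; a leaf is a group

    def group_id(self, segment):
        """Hash the segment and keep only id_bits bits (assumed hash: SHA-256)."""
        digest = hashlib.sha256(segment).digest()
        value = int.from_bytes(digest[:4], "big") >> (32 - self.id_bits)
        return format(value, "0{}b".format(self.id_bits))

    def _find_group(self, gid):
        """Walk the tree bit by bit; return the leaf group, or None if absent."""
        node = self.tree
        for bit in gid:
            node = node.get(bit)
            if node is None:
                return None
        return node  # list of segment indices belonging to this group

    def _add_group(self, gid):
        """Create the path for gid and attach an empty group as its leaf."""
        node = self.tree
        for bit in gid[:-1]:
            node = node.setdefault(bit, {})
        return node.setdefault(gid[-1], [])

    def write(self, segment):
        """Return (index, stored); stored is False when a match already exists."""
        gid = self.group_id(segment)
        group = self._find_group(gid)
        if group is None:
            # No corresponding group: the segment goes into a new group.
            group = self._add_group(gid)
        else:
            # Group exists: search it for a segment with identical content.
            for index in group:
                if self.segments[index] == segment:
                    return index, False
        self.segments.append(segment)
        index = len(self.segments) - 1
        group.append(index)
        return index, True


if __name__ == "__main__":
    repo = Repository()
    print(repo.write(b"block A"))
    print(repo.write(b"block B"))
    print(repo.write(b"block A"))  # same content as the first write
```

Because the group identifier is a truncated hash, distinct segments can land in the same group, which is why the sketch compares full segment contents within the group rather than trusting the identifier alone.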