Main claim |
1. A method of determining duplicate data for de-duplicating data in a computer system, the method comprising:
reading a first predefined set of multiple summaries associated with a first region of data in a storage of the computer system, each member of the first predefined set of multiple summaries being a micro-fingerprint value characterizing a portion of data within the first region of data; selecting a first member from the first predefined set of multiple summaries based on a value of the micro-fingerprint value of the first member; generating, at least in part, a first macro-fingerprint associated with the first region of data by storing the first member within the first macro-fingerprint; reading a second predefined set of multiple summaries associated with a set of data, each member of the second predefined set of multiple summaries being a micro-fingerprint value characterizing a portion of data within the set of data; selecting a particular member from the second predefined set of multiple summaries based on a value of the micro-fingerprint value of the particular member; generating, at least in part, a second macro-fingerprint associated with the set of data by storing the particular member within the second macro-fingerprint; and comparing the first macro-fingerprint associated with the first region with the second macro-fingerprint associated with the set of data to determine, at least in part, the duplicate data. |
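The claimed steps can be illustrated with a minimal Python sketch. It is not the patented implementation; it only shows one way the read, select, store, and compare steps could fit together. Several choices are assumptions that the claim does not state: fixed-size chunks of `CHUNK_SIZE` bytes, a truncated SHA-256 hash as the micro-fingerprint, selection of the `SELECT_COUNT` numerically smallest values as the value-based selection rule, and a shared-member threshold as the comparison. All function and constant names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4096   # assumed size of the data portion behind each micro-fingerprint
SELECT_COUNT = 4    # assumed number of members stored in a macro-fingerprint


def micro_fingerprints(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[int]:
    """Return one micro-fingerprint per chunk (truncated SHA-256, an assumed hash)."""
    return [
        int.from_bytes(hashlib.sha256(data[i:i + chunk_size]).digest()[:8], "big")
        for i in range(0, len(data), chunk_size)
    ]


def macro_fingerprint(micros: list[int], count: int = SELECT_COUNT) -> frozenset[int]:
    """Select members by their value (here: the smallest distinct values) and store them."""
    return frozenset(sorted(set(micros))[:count])


def likely_duplicates(first: frozenset[int], second: frozenset[int],
                      threshold: int = 1) -> bool:
    """Compare two macro-fingerprints; shared members suggest duplicate data."""
    return len(first & second) >= threshold


# Example data: eight distinct chunks per region.
region_a = b"".join(bytes([i]) * CHUNK_SIZE for i in range(8))
region_b = b"".join(bytes([i]) * CHUNK_SIZE for i in range(100, 108))
# Same chunks as region_a, stored in reverse order.
region_c = b"".join(bytes([i]) * CHUNK_SIZE for i in reversed(range(8)))

fp_a = macro_fingerprint(micro_fingerprints(region_a))
fp_b = macro_fingerprint(micro_fingerprints(region_b))
fp_c = macro_fingerprint(micro_fingerprints(region_c))
```

Because selection in this sketch depends only on the micro-fingerprint values, two regions holding the same chunks in a different order produce the same macro-fingerprint. A match is only a candidate; a real system would still confirm duplicates at a finer granularity before discarding data.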