Title of invention System and method for multi-scale navigation of data
Abstract A system configured to generate a macro-fingerprint from at least one predefined set of summaries is provided. The system includes data storage storing a first predefined set of summaries associated with a first region of data, each member of the first predefined set of summaries characterizing data within the first region of data; and at least one processor coupled to the data storage and configured to: read the first predefined set of summaries; select at least one first member from the first predefined set of summaries based on a value of the at least one first member; and store the at least one first member within a first macro-fingerprint. The first region of data may have a first size indicative of a quantity of data included in the first region of data. The macro-fingerprints are created from previously created smaller (micro) fingerprints without having to reread the data.
Publication number US9256611(B2) Publication date 2016.02.09
Application number US201313911482 Filing date 2013.06.06
Applicant SEPATON, INC. Inventor Trimble Ronald Ray;Kennedy Jon Christopher
Classification G06F7/00;G06F17/30 Main classification G06F7/00
Agency Mattingly & Malur, PC Agent Mattingly & Malur, PC
Main claim 1. A method of determining duplicate data for de-duplicating data in a computer system, the method comprising: reading a first predefined set of multiple summaries associated with a first region of data in a storage of the computer system, each member of the first predefined set of multiple summaries being a micro-fingerprint value characterizing a portion of data within the first region of data; selecting a first member from the first predefined set of multiple summaries based on a value of the micro-fingerprint value of the first member; generating, at least in part, a first macro-fingerprint associated with the first region of data by storing the first member within the first macro-fingerprint; reading a second predefined set of multiple summaries associated with a set of data, each member of the second predefined set of multiple summaries being a micro-fingerprint value characterizing a portion of data within the set of data; selecting a particular member from the second predefined set of multiple summaries based on a value of the micro-fingerprint value of the particular member; generating, at least in part, a second macro-fingerprint associated with the set of data by storing the second member within the second macro-fingerprint; and comparing the first macro-fingerprint associated with the first region with the second macro-fingerprint associated with the set of data to determine, at least in part, the duplicate data.
Address Marlborough MA US
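
Illustrative sketch. The steps recited in claim 1 (read a set of micro-fingerprint summaries, select members by value, store them in a macro-fingerprint, compare two macro-fingerprints) might look roughly like the Python below. This is not the patented implementation. The record only says members are selected "based on a value", so the rule used here (keep the k smallest distinct values) is an assumption, as are the fixed-size chunking, the SHA-256-derived 64-bit summaries, the Jaccard-style comparison, and every function and parameter name.

```python
import hashlib


def micro_fingerprints(data: bytes, chunk_size: int = 4) -> list[int]:
    """Hypothetical stand-in for the stored summaries: one 64-bit value
    per fixed-size chunk of the region (chunking scheme is an assumption)."""
    return [
        int.from_bytes(hashlib.sha256(data[i:i + chunk_size]).digest()[:8], "big")
        for i in range(0, len(data), chunk_size)
    ]


def macro_fingerprint(summaries: list[int], k: int = 4) -> frozenset[int]:
    """Select members by value and store them in the macro-fingerprint.
    Assumed selection rule: the k smallest distinct summary values."""
    return frozenset(sorted(set(summaries))[:k])


def overlap(a: frozenset[int], b: frozenset[int]) -> float:
    """Assumed comparison: fraction of selected members the two
    macro-fingerprints share (Jaccard index); 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Summaries are computed once per region and kept.
region_a = micro_fingerprints(b"abcdefghijklmnop")
region_b = micro_fingerprints(b"qrstuvwxyz012345")

# A macro-fingerprint for the larger, combined region is built from the
# stored summaries alone; the underlying data is not read again.
macro_ab = macro_fingerprint(region_a + region_b)

# Identical regions yield identical macro-fingerprints.
same = overlap(macro_fingerprint(region_a), macro_fingerprint(list(region_a)))
```

A high overlap between two macro-fingerprints would mark the corresponding regions as candidates for duplicate data. How that result is then confirmed or acted on is outside what this sketch covers.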