Title: Subsegmenting for efficient storage, resemblance determination, and transmission
Abstract: Transmitting or storing subsegments is disclosed. A data stream or a data block is received and broken into a plurality of segments. At least one segment is broken into a plurality of subsegments. A previously stored or transmitted segment similar to the at least one segment is identified. A fingerprint is computed for at least one subsegment. Using the fingerprint for the at least one subsegment, it is determined whether the at least one subsegment is identical to a subsegment of the previously stored or transmitted segment, without directly comparing the content of the at least one subsegment with the content of the subsegment of the previously stored or transmitted segment.
Publication number: US8768895(B2) Publication date: 2014.07.01
Application number: US200711804578 Application date: 2007.05.18
Applicant: EMC Corporation Inventors: Patterson R. Hugo; Zhu Ming Benjamin
Classification: G06F17/30 Main classification: G06F17/30
Agency: Van Pelt, Yi & James LLP Agent: Van Pelt, Yi & James LLP
Main claim: 1. A method for transmitting or storing subsegments comprising: receiving a data stream or a data block; breaking the data stream or the data block into a plurality of segments; for at least one segment: breaking the at least one segment into a plurality of subsegments; identifying a previously stored or transmitted segment similar to but not identical to the at least one segment, wherein the identification is based at least in part on a value based on one or more of content of the at least one segment and metadata associated with the at least one segment; computing a fingerprint for at least one subsegment of the at least one segment; and using the fingerprint for the at least one subsegment of the at least one segment, determining whether the at least one subsegment is identical to a subsegment of the previously stored or transmitted segment that has been identified as similar to but not identical to the at least one segment, wherein determining comprises determining without directly comparing the content of the at least one subsegment with the content of the subsegment of the previously stored or transmitted segment.
Address: Hopkinton, MA, US
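
The steps recited in the main claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything in it beyond the claim wording is an assumption made for illustration: the fixed-size splitting, the tiny SEGMENT_SIZE and SUBSEGMENT_SIZE values, the use of SHA-256 as the fingerprint, the hypothetical SegmentIndex class, and the choice of metadata (a stream name plus the segment's position) as the value used to identify a similar earlier segment. The claim also allows that value to be based on segment content; that variant is not shown here.

```python
import hashlib

# Assumed sizes, kept tiny so the example is easy to follow.
SEGMENT_SIZE = 16
SUBSEGMENT_SIZE = 4


def split_fixed(data: bytes, size: int) -> list[bytes]:
    """Break data into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """Fingerprint a chunk; SHA-256 is an assumed choice of hash."""
    return hashlib.sha256(chunk).hexdigest()


class SegmentIndex:
    """Hypothetical index of previously stored or transmitted segments."""

    def __init__(self) -> None:
        # Maps a resemblance key (stream name, segment position) to
        # (segment fingerprint, set of subsegment fingerprints).
        self._seen: dict[tuple[str, int], tuple[str, set[str]]] = {}

    def process(self, name: str, data: bytes) -> list[tuple[str, list[bool]]]:
        """Return one (status, flags) pair per segment of `data`.

        status is "new", "identical", or "similar". For "similar", flags[i]
        is True when subsegment i matches a subsegment of the earlier
        segment; the decision uses fingerprints only, never raw content.
        """
        report = []
        for position, segment in enumerate(split_fixed(data, SEGMENT_SIZE)):
            sub_fps = [fingerprint(s)
                       for s in split_fixed(segment, SUBSEGMENT_SIZE)]
            seg_fp = fingerprint(segment)
            key = (name, position)  # metadata-based value (an assumption)
            earlier = self._seen.get(key)
            if earlier is None:
                report.append(("new", []))
            elif earlier[0] == seg_fp:
                report.append(("identical", []))
            else:
                # Similar but not identical: compare subsegment fingerprints.
                known = earlier[1]
                report.append(("similar", [fp in known for fp in sub_fps]))
            self._seen[key] = (seg_fp, set(sub_fps))
        return report


# Usage: the second version of the stream changes one subsegment.
index = SegmentIndex()
index.process("file", b"AAAABBBBCCCCDDDD" + b"EEEEFFFFGGGGHHHH")
report = index.process("file", b"AAAABBBBXXXXDDDD" + b"EEEEFFFFGGGGHHHH")
# report == [("similar", [True, True, False, True]), ("identical", [])]
```

In this sketch only the subsegment flagged False would need to be stored or transmitted in full; the others could be replaced by references to the matching subsegments of the earlier segment.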