Title of invention: A similarity module, a local computer, a server of a data hosting service and associated methods
Abstract: A similarity module is arranged to identify whether a local data file is identical, in whole or in part, to existing files stored by a data hosting service, the local and existing files being hierarchically structured. The similarity module is configured to compare local file metadata with metadata of existing files and to identify as candidate matches those existing files whose metadata matches the local file metadata to a predetermined extent. A local file checksum is then compared with the checksums of the candidate existing files; if there is a match, the local file is identified as a duplicate of the candidate existing file. If there is no match, the module compares local segment checksums with existing segment checksums from the candidate existing files, the segment checksums having been generated by semantically segmenting the local and existing files along their hierarchy, dividing them vertically and horizontally into segments. If the checksum of a local segment matches that of an existing file segment, the local segment is identified as a duplicate segment. The similarity module may be located on a local computer or on the server.
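The three comparison stages described in the abstract (metadata filter, whole-file checksum, segment checksums) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the abstract does not specify a checksum algorithm, a metadata matching rule, or a segmentation scheme. Here SHA-256 stands in for the checksum, the "predetermined extent" is modelled as a hypothetical `threshold` on the fraction of equal metadata fields, a hierarchical file is modelled as a nested `dict`, and every subtree is treated as one segment. All function and field names (`find_duplicates`, `metadata_overlap`, `segment_checksums`, `meta`, `content`, `name`) are invented for illustration.

```python
import hashlib
import json


def checksum(obj):
    """SHA-256 over a canonical JSON form (assumed checksum; not from the source)."""
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def segment_checksums(tree, path=""):
    """Walk the hierarchy and return {checksum: path} for every subtree.

    Treating each subtree as a segment is an assumption standing in for
    the semantic segmentation mentioned in the abstract.
    """
    sums = {}
    if isinstance(tree, dict):
        for key, child in tree.items():
            child_path = f"{path}/{key}"
            sums[checksum(child)] = child_path
            sums.update(segment_checksums(child, child_path))
    return sums


def metadata_overlap(local_meta, existing_meta):
    """Fraction of local metadata fields with an equal value in existing_meta."""
    if not local_meta:
        return 0.0
    same = sum(1 for k, v in local_meta.items() if existing_meta.get(k) == v)
    return same / len(local_meta)


def find_duplicates(local, hosted, threshold=0.5):
    """Return ("file", name), ("segments", matches) or ("none", None)."""
    # Stage 1: keep only existing files whose metadata matches well enough.
    candidates = [
        f for f in hosted
        if metadata_overlap(local["meta"], f["meta"]) >= threshold
    ]

    # Stage 2: a whole-file checksum match means a duplicate file.
    local_sum = checksum(local["content"])
    for f in candidates:
        if checksum(f["content"]) == local_sum:
            return "file", f["name"]

    # Stage 3: otherwise compare segment checksums against each candidate.
    local_segments = segment_checksums(local["content"])
    matches = {}
    for f in candidates:
        existing_segments = segment_checksums(f["content"])
        for digest, local_path in local_segments.items():
            if digest in existing_segments and local_path not in matches:
                matches[local_path] = (f["name"], existing_segments[digest])
    return ("segments", matches) if matches else ("none", None)
```

In this sketch the cheap metadata filter runs first so that checksums are only compared against a small candidate set, which mirrors the ordering given in the abstract; the same function could run either on the local computer or on the server.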
Publication number: GB201517067(D0) Publication date: 2015.11.11
Application number: GB20150017067 Filing date: 2015.09.28
Applicant: FUJITSU LIMITED Inventor:
Classification: Main classification:
Agency: Agent:
Principal claim:
Address: