Title: Similarity analysis method, apparatus, and system
Abstract: A similarity analysis method, apparatus, and system. The method includes acquiring file fingerprint information of a file to be analyzed; sending an analysis request that carries the file fingerprint information to at least two meta data servers (MDSs); and selecting at least one group according to the analysis result returned by each MDS. Each analysis result includes the group number and the similarity of at least one group that the MDS found to have the highest similarity with the file fingerprint information. The MDS then locally queries the selected group for duplicate data blocks. Each MDS therefore needs to query only the file fingerprint information sets of the groups it is itself responsible for, which reduces the amount of data retrieved and the time spent waiting to read, write, and lock a database file.
Publication No.: US9575984(B2) Publication date: 2017.02.21
Application No.: US201615162866 Filing date: 2016.05.24
Applicant: Huawei Technologies Co., Ltd. Inventor: Huang Yan
Classification: G06F7/00;G06F17/00;G06F17/30;G06F3/06 Main classification: G06F7/00
Agency: Fish & Richardson P.C. Agent: Fish & Richardson P.C.
Main claim: 1. A data de-duplicate engine (DDE), comprising: a processor; and a storage medium coupled to the processor and configured to store a program code, wherein when executing the program code, the processor is configured to: acquire file fingerprint information of a file to be analyzed; send an analysis request that carries the file fingerprint information to at least two meta data servers (MDSs), wherein the MDSs are each assigned at least one fingerprint information group, and wherein the MDSs respectively query a local file fingerprint information set in their assigned fingerprint information group according to the file fingerprint information of the file to be analyzed; receive an analysis result respectively returned by each of the MDSs based on the query, wherein the analysis result comprises a group number of the fingerprint information group which has a highest similarity value with the file fingerprint information of the file to be analyzed among the assigned fingerprint information group, and the similarity value; select at least one fingerprint information group according to the received analysis result; and send block fingerprint information of each data block in the file to be analyzed to an MDS of the MDSs that the selected fingerprint information group belongs to, wherein the MDS that the selected fingerprint information group belongs to compares the block fingerprint information of each data block in the file to be analyzed with local block fingerprint information in the selected fingerprint information group to query a duplicate data block.
Address: Shenzhen CN
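The two-stage flow recited in the main claim (group selection by file fingerprint, then block-level comparison on the owning MDS) can be illustrated with a minimal in-process sketch. Everything below is a hypothetical illustration rather than the patented implementation: the class names `Group`, `MDS`, and `DDE`, the use of SHA-256 as the block fingerprint, the sampling of block fingerprints to form the file fingerprint information, and the Jaccard index as the similarity value are all assumptions, since the record does not specify them. Network transport between the DDE and the MDSs is replaced by direct method calls.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(block: bytes) -> str:
    """Block fingerprint; SHA-256 is an assumption, not named in the record."""
    return hashlib.sha256(block).hexdigest()


def jaccard(a: set, b: set) -> float:
    """Similarity value between two fingerprint sets (assumed metric)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class Group:
    """A fingerprint information group assigned to exactly one MDS."""
    number: int
    file_fps: set = field(default_factory=set)   # file fingerprint information set
    block_fps: set = field(default_factory=set)  # local block fingerprint information


class MDS:
    """Hypothetical meta data server that only queries its own groups."""

    def __init__(self, groups):
        self.groups = {g.number: g for g in groups}

    def analyze(self, file_fps: set):
        # Return the group number with the highest similarity, and that value.
        best = max(self.groups.values(), key=lambda g: jaccard(file_fps, g.file_fps))
        return best.number, jaccard(file_fps, best.file_fps)

    def find_duplicates(self, group_number: int, block_fps: list) -> list:
        # Compare incoming block fingerprints with the selected group's local ones.
        known = self.groups[group_number].block_fps
        return [fp for fp in block_fps if fp in known]


class DDE:
    """Hypothetical de-duplicate engine coordinating at least two MDSs."""

    def __init__(self, mdss):
        if len(mdss) < 2:
            raise ValueError("the claim recites at least two MDSs")
        self.mdss = mdss

    def deduplicate(self, blocks, sample_every: int = 2):
        block_fps = [fingerprint(b) for b in blocks]
        # Assumed form of "file fingerprint information": a sample of block fingerprints.
        file_fps = set(block_fps[::sample_every])
        # Stage 1: every MDS reports its best-matching group.
        results = [(mds, *mds.analyze(file_fps)) for mds in self.mdss]
        owner, group_number, similarity = max(results, key=lambda r: r[2])
        # Stage 2: only the owning MDS performs the block-level comparison.
        duplicates = owner.find_duplicates(group_number, block_fps)
        return group_number, similarity, duplicates


# Usage: two MDSs, one group each; the file shares two blocks with group 2.
fps_1 = {fingerprint(b) for b in (b"a", b"b", b"c")}
fps_2 = {fingerprint(b) for b in (b"x", b"y", b"z")}
mds_a = MDS([Group(1, set(fps_1), set(fps_1))])
mds_b = MDS([Group(2, set(fps_2), set(fps_2))])
dde = DDE([mds_a, mds_b])
group_number, similarity, duplicates = dde.deduplicate([b"x", b"y", b"q"], sample_every=1)
print(group_number, similarity, len(duplicates))  # 2 0.5 2
```

In this sketch the DDE keeps only the single best group, whereas the claim allows selecting "at least one" group; extending the selection to the top few results would be a straightforward variation.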