Title: Method and system for near-duplicate image searching
Abstract: Image processing includes dividing a plurality of images into a plurality of groups wherein images in the same group share the same main color; extracting a color feature vector (CFV) of each image in the plurality of groups; subdividing images in each of the plurality of groups into a plurality of subgroups using a clustering technique according to a distance between the CFVs of the images in the group to establish an image signature tree; searching among the plurality of subgroups for a result subgroup having the same main color as the main color of a given image and containing an image whose CFV has the shortest distance from the CFV of the given image; comparing the CFV of the given image with the CFVs in the result subgroup; and identifying a near-duplicate image from the result subgroup that meets a preset near-duplicate image determining condition.
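The flow in the abstract can be illustrated with a minimal Python sketch. The abstract does not define how the main color or the CFV is computed, so everything concrete here is an assumption made for illustration: an image is a list of `(r, g, b)` tuples, the main color is the most frequent coarse color bin, the CFV is a normalized histogram over those bins, the distance is Euclidean, and the near-duplicate condition is a distance threshold. The sketch also omits the clustering into subgroups and compares against every image in the matching main-color group.

```python
from collections import Counter, defaultdict
import math

def quantize(pixel, levels=4):
    """Map an (r, g, b) pixel with 0-255 channels to a coarse color bin index."""
    step = 256 // levels
    r, g, b = (c // step for c in pixel)
    return (r * levels + g) * levels + b

def main_color(pixels, levels=4):
    """Assumed definition: the most frequent coarse color bin in the image."""
    return Counter(quantize(p, levels) for p in pixels).most_common(1)[0][0]

def color_feature_vector(pixels, levels=4):
    """Assumed CFV: normalized histogram over the coarse color bins."""
    hist = [0.0] * levels ** 3
    for p in pixels:
        hist[quantize(p, levels)] += 1.0
    return [h / len(pixels) for h in hist]

def build_index(images):
    """Group image ids by main color and store each image's CFV."""
    index = defaultdict(dict)
    for image_id, pixels in images.items():
        index[main_color(pixels)][image_id] = color_feature_vector(pixels)
    return index

def find_near_duplicates(index, query_pixels, threshold=0.1):
    """Return ids in the query's main-color group whose CFV lies within threshold."""
    group = index.get(main_color(query_pixels), {})
    query_cfv = color_feature_vector(query_pixels)
    # Sort by (distance, id) so the closest candidates come first.
    scored = sorted((math.dist(query_cfv, cfv), image_id)
                    for image_id, cfv in group.items())
    return [image_id for d, image_id in scored if d <= threshold]
```

Restricting the comparison to images that share the query's main color is what keeps the candidate set small; the subgroup tree in the claim narrows it further.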
Publication number: US9405993(B2) Publication date: 2016.08.02
Application number: US201314080537 Filing date: 2013.11.14
Applicant: Alibaba Group Holdings Limited Inventor: Jia Menglei
Classification: G06K9/00;G06K9/62;G06K9/46;G06F17/30 Main classification: G06K9/00
Agency: Van Pelt, Yi & James LLP Agent: Van Pelt, Yi & James LLP
Independent claim: 1. An image processing method, comprising: dividing a plurality of images into a plurality of groups wherein images in a same group share a same main color; extracting a color feature vector (CFV) of each image in the plurality of groups; subdividing images in each of the plurality of groups into a plurality of subgroups using a clustering technique according to distances between the CFVs of the images in the group, wherein subdividing the images into a plurality of subgroups comprises: setting a first group of the plurality of groups as a current image group; setting a main color of the images in the current image group as a root node of a subtree of an image signature tree, and setting the root node as a current parent node; and performing recursive division of the images in the current image group, comprising: dividing the CFVs of the images in the current image group into K subgroups, using the clustering technique according to distances between the CFVs of the images in the current image group, wherein K is an integer greater than 1; setting a clustering center of the CFVs of a first subgroup of the K subgroups as a first child node of the current parent node, setting the first subgroup as the current image group, and setting the first child node as the current parent node in the event that the first subgroup does not meet a predetermined grouping stop condition; and setting the images corresponding to the CFVs of a first subgroup of the K subgroups as the child nodes of the current parent node, and selecting the first subgroup as one of the plurality of subgroups comprising images which are obtained using the clustering technique according to the distances between the CFVs of the images in the event that the first subgroup meets the predetermined grouping stop condition; searching, using one or more computer processors, among the plurality of subgroups for a result subgroup having a same main color as a main color of a given image and comprising an image whose CFV has a shortest distance from the CFV of the given image; comparing the CFV of the given image with the CFVs in the result subgroup; and identifying a near-duplicate image from the result subgroup that meets a preset near-duplicate image determining condition.
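The recursive division recited in the claim can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the claim names only "a clustering technique" and "a predetermined grouping stop condition", so plain k-means and a maximum subgroup size (`max_leaf_size`) stand in for them here. The `Node` class and all function names are hypothetical, and the sketch applies the same handling to every one of the K subgroups.

```python
import math
import random

class Node:
    def __init__(self, key):
        self.key = key        # main color (root) or cluster center (inner node)
        self.children = []    # child Nodes for subgroups that are split further
        self.subgroups = []   # lists of image ids attached once splitting stops

def kmeans(vectors, k, iterations=10, seed=0):
    """Plain Lloyd's k-means; returns (center, member_indices) per non-empty cluster."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for i, v in enumerate(vectors):
            nearest = min(range(k), key=lambda c: math.dist(v, centers[c]))
            clusters[nearest].append(i)
        for c, members in enumerate(clusters):
            if members:  # keep the old center if a cluster went empty
                dim = len(vectors[0])
                centers[c] = [sum(vectors[i][d] for i in members) / len(members)
                              for d in range(dim)]
    return [(centers[c], clusters[c]) for c in range(k) if clusters[c]]

def build_subtree(parent, items, k=2, max_leaf_size=2):
    """items: list of (image_id, cfv). Split recursively until subgroups are small."""
    vectors = [cfv for _, cfv in items]
    for center, members in kmeans(vectors, min(k, len(items))):
        subgroup = [items[i] for i in members]
        # Assumed stop condition: the subgroup is small enough, or clustering
        # made no progress (guards against endless recursion on identical CFVs).
        if len(subgroup) <= max_leaf_size or len(subgroup) == len(items):
            parent.subgroups.append([image_id for image_id, _ in subgroup])
        else:
            child = Node(center)  # cluster center becomes a child node
            parent.children.append(child)
            build_subtree(child, subgroup, k, max_leaf_size)

def build_signature_tree(groups, k=2, max_leaf_size=2):
    """groups: {main_color: [(image_id, cfv), ...]} -> {main_color: root Node}."""
    tree = {}
    for color, items in groups.items():
        root = Node(color)  # the main color is the root of this subtree
        build_subtree(root, items, k, max_leaf_size)
        tree[color] = root
    return tree
```

A search would then start at the root matching the query's main color and descend toward the child whose center is nearest to the query CFV until it reaches a leaf subgroup.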
Address: KY