Title of Invention: Compact face representation
Abstract: A deep learning framework jointly optimizes the compactness and discriminative ability of face representations. The representation can be as compact as 32 bits while remaining highly discriminative. In another aspect, because the representation is so compact, traditional face analysis tasks (e.g., gender analysis) can be solved effectively with a look-up-table approach, given a large-scale face data set.
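The look-up-table idea in the abstract can be illustrated with a minimal sketch. It assumes a representation of at most 32 real values is turned into bits by simple thresholding, and that each code is labeled by majority vote over a labeled data set. The thresholding rule, the majority vote, and the names `pack_bits`, `build_lut`, and `lookup` are hypothetical choices made for illustration; the record does not specify them.

```python
from collections import Counter, defaultdict


def pack_bits(features, threshold=0.0):
    """Pack up to 32 real values into one integer, one bit per value.

    Thresholding is an assumption of this sketch, not a rule from the patent.
    """
    if len(features) > 32:
        raise ValueError("expected at most 32 features")
    code = 0
    for value in features:
        # Shift in a 1 when the value exceeds the threshold, else a 0.
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def build_lut(codes, labels):
    """Map each observed code to the label seen most often for it."""
    votes = defaultdict(Counter)
    for code, label in zip(codes, labels):
        votes[code][label] += 1
    return {code: counter.most_common(1)[0][0] for code, counter in votes.items()}


def lookup(lut, code, default=None):
    """Return the stored label for a code, or a default for unseen codes."""
    return lut.get(code, default)
```

With a table built this way, answering a query such as gender analysis reduces to one dictionary access per face, which is only practical because the code space is small enough to be populated by a large data set.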
Publication No.: US9400918(B2) Publication Date: 2016.07.26
Application No.: US201414375668 Filing Date: 2014.05.29
Applicant: Beijing Kuangshi Technology Co., Ltd. Inventors: Yin Qi; Cao Zhimin; Jiang Yuning; Fan Haoqiang
Classification: G06K9/00; G06K9/46; G06K9/66 Main Classification: G06K9/00
Agency: Fenwick & West LLP Agent: Fenwick & West LLP
Independent Claim: 1. A computer-implemented method for training a deep learning neural network for compact face representations, the method comprising: presenting face images to the neural network, wherein the neural network is a pyramid convolutional neural network (CNN) comprising at least N shared layers where N≧2 and at least one unshared network coupled to the Nth shared layer; the neural network processing the face images to produce compact representations of the face images, wherein the compact representations have not more than 64 dimensions; processing the compact representations to produce estimates of a metric, for which actual values of the metric are known; and training the neural network based on the estimates of the metric compared to the actual values of the metric, wherein training the pyramid CNN comprises: training CNN levels 1 to N in that order, wherein CNN level n comprises an input for receiving the face images, the first n shared layers of the pyramid CNN, the unshared network of the pyramid CNN, and an output producing the compact representations of the face images; wherein the input is coupled to a first of the n shared layers; each shared layer includes convolution, non-linearity and down-sampling; an nth of the n shared layers is coupled to the unshared network; and the unshared network is coupled to the output, wherein training CNN level n comprises: presenting face images to the input, each face image producing the corresponding compact representation at the output, processing the compact representations to produce estimates of a metric, for which actual values of the metric are known, and adapting the nth shared layer and the unshared network based on the estimates of the metric compared to the actual values of the metric.
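The level-by-level training order recited in the claim can be laid out as a small structural sketch. It only enumerates, for each CNN level n, the forward path and the parts the claim says are adapted; it does not train anything. The function name `pyramid_schedule` and the layer labels are hypothetical, and treating shared layers 1 to n-1 as held fixed at level n is an assumption of this sketch that the claim does not state explicitly.

```python
def pyramid_schedule(num_shared_layers):
    """List the training levels 1..N of a pyramid CNN with N shared layers.

    Each entry records the forward path of CNN level n and which parts
    are adapted at that level, following the wording of the claim.
    """
    if num_shared_layers < 2:
        raise ValueError("the claim requires N >= 2 shared layers")

    schedule = []
    for n in range(1, num_shared_layers + 1):
        # Level n uses the first n shared layers, then the unshared network.
        shared = [f"shared_{i}" for i in range(1, n + 1)]
        schedule.append({
            "level": n,
            "forward_path": ["input"] + shared + ["unshared", "output"],
            # The claim recites adapting the nth shared layer and the unshared network.
            "adapted": [f"shared_{n}", "unshared"],
            # Assumption: earlier shared layers are left unchanged at this level.
            "held_fixed": shared[:-1],
        })
    return schedule
```

Calling `pyramid_schedule(3)` yields three entries in training order, each one shared layer deeper than the last, which mirrors "training CNN levels 1 to N in that order".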
Address: Beijing CN