发明名称 Font recognition and font similarity learning using a deep neural network
摘要 A convolutional neural network (CNN) is trained for font recognition and font similarity learning. In a training phase, text images with font labels are synthesized by introducing variances to minimize the gap between the training images and real-world text images. Training images are generated and input into the CNN. The output is fed into an N-way softmax function dependent on the number of fonts the CNN is being trained on, producing a distribution of classified text images over N class labels. In a testing phase, each test image is normalized in height and squeezed in aspect ratio resulting in a plurality of test patches. The CNN averages the probabilities of each test patch belonging to a set of fonts to obtain a classification. Feature representations may be extracted and utilized to define font similarity between fonts, which may be utilized in font suggestion, font browsing, or font recognition applications.
申请公布号 US9501724(B1) 申请公布日期 2016.11.22
申请号 US201514734466 申请日期 2015.06.09
申请人 ADOBE SYSTEMS INCORPORATED 发明人 Yang Jianchao;Wang Zhangyang;Brandt Jonathan;Jin Hailin;Shechtman Elya;Agarwala Aseem Omprakash
分类号 G06K9/00;G06K9/36;G06K9/66;G06K9/62;G06K9/68;G06T3/40 主分类号 G06K9/00
代理机构 Shook, Hardy & Bacon L.L.P. 代理人 Shook, Hardy & Bacon L.L.P.
主权项 1. A non-transitory computer storage medium comprising computer-useable instructions that, when used by one or more computing devices, cause the one or more computing devices to perform operations comprising: receiving one or more text images, each of the one or more text images including a corresponding font label; synthesizing the one or more text images to introduce variances in the one or more text images; generating one or more training images that include the variances; cropping the one or more training images into training patches that are utilized as an input to a convolutional neural network (CNN); training the CNN with the training patches, the CNN comprising a plurality of convolutional layers and a plurality of fully connected layers; and producing a distribution of classified text images.
地址 San Jose CA US