Title: Method and system for optimizing accuracy-specificity trade-offs in large scale visual recognition
Abstract: As visual recognition scales up to ever larger numbers of categories, maintaining high accuracy is increasingly difficult. Embodiments of the present invention include methods for optimizing accuracy-specificity trade-offs in large scale recognition where object categories form a semantic hierarchy consisting of many levels of abstraction.
Publication number: US9158965(B2) Publication date: 2015.10.13
Application number: US201313831833 Filing date: 2013.03.15
Applicant: The Board of Trustees of the Leland Stanford Junior University Inventors: Li Fei-Fei;Deng Jia;Krause Jonathan;Berg Alexander C.
Classification: G06K9/00;G06K9/62 Main classification: G06K9/00
Agency: KPPB LLP Agent: KPPB LLP
Principal claim: 1. A method for classifying images, comprising: receiving an input image to classify using a computer system; scoring a likelihood of each individual node in a plurality of nodes of a classifier using a computer system, where the classifier includes a semantic hierarchy in which the plurality of nodes correspond to a hierarchy of named entities and a set of individual object classifiers to classify a likelihood that the input image contains a named entity in one of a plurality of leaf nodes from the plurality of nodes, where the plurality of leaf nodes correspond to a set of mutually exclusive named entities in the hierarchy of named entities; selecting an individual node from the plurality of nodes most descriptive of the image using a computer system, where the individual node is determined by: iteratively estimating a reward weight within the classifier that achieves a predetermined accuracy, where the accuracy of the classifier is determined by classifying a validation data set using the estimated reward weight; determining reward weighted likelihoods using the estimated reward weight that achieves the predetermined accuracy; and selecting as the individual node most descriptive of the image the individual node within the plurality of nodes in the semantic hierarchy that has the highest reward weighted likelihood; classifying the input image as a named entity corresponding to the individual node most descriptive of the image using a computer system; and returning the named entity as a classification of the input image using a computer system.
Address: Stanford CA US
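
The node-selection step recited in claim 1 can be illustrated with a minimal Python sketch. It is only one possible reading under stated assumptions, not the patented implementation: the toy hierarchy, the per-node rewards, the names `select_node` and `estimate_reward_weight`, the scoring form (reward + weight) × likelihood, and the use of bisection as the iterative estimate are all hypothetical choices made for illustration. An internal node's likelihood is taken to be the sum of the likelihoods of its leaf descendants, which is consistent with the leaves being mutually exclusive.

```python
from typing import Dict, List, Tuple

# Hypothetical toy hierarchy: node -> parent (None marks the root).
PARENT = {"entity": None, "animal": "entity", "vehicle": "entity",
          "dog": "animal", "cat": "animal", "car": "vehicle", "bus": "vehicle"}
# Hypothetical rewards: more specific nodes earn more, the root earns nothing.
REWARD = {"entity": 0.0, "animal": 1.0, "vehicle": 1.0,
          "dog": 2.0, "cat": 2.0, "car": 2.0, "bus": 2.0}

def ancestors_and_self(node: str) -> List[str]:
    """Return the node followed by all of its ancestors up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain

def node_likelihoods(leaf_probs: Dict[str, float]) -> Dict[str, float]:
    """Score every node: an internal node sums its leaf descendants."""
    probs = {n: 0.0 for n in PARENT}
    for leaf, p in leaf_probs.items():
        for n in ancestors_and_self(leaf):
            probs[n] += p
    return probs

def select_node(leaf_probs: Dict[str, float], lam: float) -> str:
    """Pick the node with the highest reward-weighted likelihood (assumed form)."""
    probs = node_likelihoods(leaf_probs)
    return max(PARENT, key=lambda n: (REWARD[n] + lam) * probs[n])

def accuracy(validation: List[Tuple[Dict[str, float], str]], lam: float) -> float:
    """A prediction counts as correct if it is the true leaf or one of its ancestors."""
    hits = sum(select_node(p, lam) in ancestors_and_self(y) for p, y in validation)
    return hits / len(validation)

def estimate_reward_weight(validation, target: float,
                           iters: int = 30, lam_max: float = 1000.0) -> float:
    """Bisect for a small weight that reaches the target validation accuracy.

    Assumes accuracy does not decrease as the weight grows; if even lam_max
    misses the target, the returned value does not reach it either.
    """
    if accuracy(validation, 0.0) >= target:
        return 0.0
    lo, hi = 0.0, lam_max
    for _ in range(iters):
        mid = (lo + hi) / 2
        if accuracy(validation, mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical validation set: (leaf likelihoods, true leaf label).
VALIDATION = [
    ({"dog": 0.70, "cat": 0.20, "car": 0.05, "bus": 0.05}, "dog"),
    ({"dog": 0.45, "cat": 0.40, "car": 0.10, "bus": 0.05}, "cat"),
    ({"dog": 0.05, "cat": 0.05, "car": 0.80, "bus": 0.10}, "car"),
]

if __name__ == "__main__":
    lam = estimate_reward_weight(VALIDATION, target=1.0)
    for leaf_probs, label in VALIDATION:
        print(label, "->", select_node(leaf_probs, lam))
```

In this sketch a larger weight pushes predictions toward more general nodes, which are more likely to be correct but less specific. The search therefore looks for a small weight that still meets the accuracy target, so that confident images keep a leaf-level label while ambiguous ones fall back to a parent category.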