Title of invention: Classifying Data Objects
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying data objects. One of the methods includes obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object.
Publication number: US2015178383(A1) Publication date: 2015.06.25
Application number: US201414576907 Filing date: 2014.12.19
Applicant: Google Inc. Inventors: Corrado Gregory Sean;Mikolov Tomas;Bengio Samy;Singer Yoram;Shlens Jonathon;Frome Andrea L.;Dean Jeffrey Adgate;Norouzi Mohammad
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim: 1. A method performed by one or more computers, the method comprising: obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term, wherein the high-dimensional representation of the term is a numeric representation of the term in a high-dimensional space; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, wherein the respective score for each of the plurality of categories represents a likelihood that the data object belongs to the category, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object.
Address: Mountain View CA US
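
The steps recited in the principal claim can be illustrated with a minimal sketch. The claim does not state how the aggregate representation is computed or how "closest" is measured, so the sketch assumes a score-weighted average of the category-label vectors and cosine similarity. The function name `select_label`, the toy vocabulary, and the 3-dimensional vectors are hypothetical and chosen only for readability; real representations would have many more dimensions.

```python
import numpy as np


def select_label(embeddings, category_labels, scores):
    """Pick the vocabulary term closest to a score-weighted label embedding.

    embeddings: dict mapping each vocabulary term to a 1-D numeric vector.
    category_labels: one label per category; each must be a key of embeddings.
    scores: one likelihood-style score per category (assumed to sum to > 0).
    Assumption: aggregate = score-weighted average; closeness = cosine similarity.
    """
    scores = np.asarray(scores, dtype=float)
    weights = scores / scores.sum()  # normalize so the weights sum to 1

    # Aggregate representation: weighted sum of the category-label vectors.
    label_vecs = np.stack(
        [np.asarray(embeddings[label], dtype=float) for label in category_labels]
    )
    aggregate = weights @ label_vecs

    # Cosine similarity between the aggregate and every vocabulary term.
    # Assumes no zero-length vectors, which would cause a division by zero.
    terms = list(embeddings)
    matrix = np.stack([np.asarray(embeddings[t], dtype=float) for t in terms])
    sims = (matrix @ aggregate) / (
        np.linalg.norm(matrix, axis=1) * np.linalg.norm(aggregate)
    )
    return terms[int(np.argmax(sims))], aggregate


# Toy data (hypothetical): "pet" lies between "cat" and "dog" in the space.
toy_embeddings = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "pet": [0.7, 0.7, 0.0],
    "car": [0.0, 0.0, 1.0],
}
toy_labels = ["cat", "dog", "car"]
toy_scores = [0.50, 0.45, 0.05]

chosen_term, aggregate_vec = select_label(toy_embeddings, toy_labels, toy_scores)
print(chosen_term)
```

In this toy setup the classifier is split between "cat" and "dog", and the selected term is one that was not among the original categories, which is the situation the claim's final two steps address. Other aggregation rules or distance measures would fit the claim language equally well.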