Title of Invention |
Classifying Data Objects |
Abstract |
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying data objects. One of the methods includes obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object. |
Publication Number |
US2015178383(A1) |
Publication Date |
2015.06.25 |
Application Number |
US201414576907 |
Filing Date |
2014.12.19 |
Applicant |
Google Inc. |
Inventors |
Corrado Gregory Sean; Mikolov Tomas; Bengio Samy; Singer Yoram; Shlens Jonathon; Frome Andrea L.; Dean Jeffrey Adgate; Norouzi Mohammad |
Classification Number |
G06F17/30 |
Main Classification Number |
G06F17/30 |
Agency |
|
Agent |
|
Principal Claim |
1. A method performed by one or more computers, the method comprising:
obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term, wherein the high-dimensional representation of the term is a numeric representation of the term in a high-dimensional space;
obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, wherein the respective score for each of the plurality of categories represents a likelihood that the data object belongs to the category, and wherein each of the categories is associated with a respective category label;
computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores;
identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and
selecting the first term as a category label for the data object. |
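The steps of claim 1 can be illustrated with a minimal Python sketch. The claim does not say how the aggregate representation is computed or how "closest" is measured, so this sketch assumes a score-weighted average and cosine similarity. The function names (`aggregate_embedding`, `cosine_similarity`, `closest_term`) and the toy 3-dimensional vectors are hypothetical and chosen only for illustration; they do not come from the patent.

```python
import math


def aggregate_embedding(scores, label_vectors):
    """Combine category-label vectors into one vector.

    Assumption: a score-weighted average. The claim only says the aggregate
    is computed "from" the label representations and the scores.
    Requires at least one nonzero score.
    """
    total = sum(scores.values())
    dim = len(next(iter(label_vectors.values())))
    agg = [0.0] * dim
    for label, score in scores.items():
        vec = label_vectors[label]
        for i in range(dim):
            agg[i] += (score / total) * vec[i]
    return agg


def cosine_similarity(a, b):
    """Cosine similarity of two nonzero vectors (assumed closeness measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_term(vocab_vectors, target):
    """Return the vocabulary term whose vector is most similar to target."""
    return max(vocab_vectors,
               key=lambda term: cosine_similarity(vocab_vectors[term], target))


# Hypothetical toy vocabulary: each term maps to a numeric vector.
vocab_vectors = {
    "lion": [1.0, 0.0, 0.0],
    "tiger": [0.0, 1.0, 0.0],
    "liger": [0.7, 0.7, 0.1],
    "car": [0.0, 0.0, 1.0],
}

# Hypothetical classifier output: a likelihood score per category label.
scores = {"lion": 0.5, "tiger": 0.5}

agg = aggregate_embedding(scores, vocab_vectors)
label = closest_term(vocab_vectors, agg)
print(label)
```

In this toy setup the classifier is evenly split between "lion" and "tiger", and the selected label is "liger", a vocabulary term that is not one of the scored categories. Any other aggregation rule or distance measure could be substituted without changing the overall flow.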
Address |
Mountain View CA US |