Title: Non-text content item search
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting labels for a non-text content item. In one aspect, a method receives a set of initial labels for a non-text content item, wherein the set of initial labels specifies text that has been identified as descriptive of the non-text content item and a web page to which the text corresponds. Initial labels corresponding to sets of matching web pages are grouped into separate initial label groups that correspond to each set of matching web pages. Sets of matching labels are grouped into other separate initial label groups that correspond to the sets of matching labels. One or more words that are included in at least a threshold number of the separate label groups are selected as final labels for the non-text content item.
Publication number: US8856125(B1) Publication date: 2014.10.07
Application number: US201213606336 Filing date: 2012.09.07
Applicant: Google Inc. Inventors: Malpani Radhika; Preetham Arcot J.; Mate Omkar
Classification: G06F17/30 Main classification: G06F17/30
Agency: Fish & Richardson P.C. Agent: Fish & Richardson P.C.
Main claim: 1. A method performed by data processing apparatus, the method comprising: identifying a non-text content item that is associated with each of a plurality of web pages; receiving label data that includes a set of initial labels for the non-text content item, wherein each initial label includes one or more words; grouping, for each of two or more sets of matching web pages among the plurality of web pages, initial labels that are associated with the set of matching web pages into a label group, the initial labels for different sets of matching web pages being grouped into different label groups; grouping different sets of matching labels from the set of initial labels into different label groups; and selecting, as a final label for the non-text content item, an n-gram of one or more words that is included in at least a threshold number of different label groups.
Address: Mountain View, CA, US
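
The grouping and selection steps recited in the main claim can be illustrated with a minimal Python sketch. It is one possible reading of the claim, not the patented implementation. The record does not define when web pages or labels "match", so the sketch assumes that pages match when they share a host name and that labels match when their lowercased, whitespace-normalized text is identical. It also collapses both groupings into a single partition, so that labels linked by either kind of match end up in the same label group. The function names `ngrams` and `select_final_labels`, the `max_n` parameter, and the example URLs are all hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def ngrams(words, max_n):
    """Yield every contiguous n-gram of 1..max_n words as a string."""
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def select_final_labels(initial_labels, threshold, max_n=2):
    """Pick n-grams that occur in at least `threshold` label groups.

    initial_labels: list of (label_text, page_url) pairs.
    Assumption: pages match on host name, labels match on normalized text.
    """
    # Union-find over label indices: each label starts in its own group.
    parent = list(range(len(initial_labels)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge labels that share a page host or share normalized text.
    first_seen = {}
    for i, (text, url) in enumerate(initial_labels):
        host_key = ("host", urlsplit(url).netloc.lower())
        text_key = ("text", " ".join(text.lower().split()))
        for key in (host_key, text_key):
            if key in first_seen:
                parent[find(i)] = find(first_seen[key])
            else:
                first_seen[key] = i

    # Collect the distinct n-grams of each label group.
    group_ngrams = {}
    for i, (text, _url) in enumerate(initial_labels):
        grams = group_ngrams.setdefault(find(i), set())
        grams.update(ngrams(text.lower().split(), max_n))

    # Each group votes at most once per n-gram.
    counts = Counter()
    for grams in group_ngrams.values():
        counts.update(grams)
    return sorted(g for g, c in counts.items() if c >= threshold)


example = [
    ("Eiffel Tower at night", "http://a.example/1"),
    ("Eiffel Tower", "http://a.example/2"),    # same host as the first label
    ("eiffel tower", "http://b.example/x"),    # same text as the second label
    ("Paris tower photo", "http://c.example/p"),
    ("tower of Paris", "http://d.example/q"),
]
print(select_final_labels(example, threshold=2))
```

In this example the first three labels fall into one group, so "eiffel" is counted once rather than three times and is not selected, while "tower" and "paris" each appear in at least two independent groups. Counting per group rather than per label is what keeps a single site, or a single repeated caption, from dominating the final labels.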