Title of invention: Tag selection, clustering, and recommendation for content hosting services
Abstract: Content object tags at a content hosting service are used to classify stored content objects. Tags and clusters of tags (groups of one or more associated tags) can be recommended to a user of the content hosting service based on a user context, such as the browsing, viewing, uploading, or searching of content objects. Tags are scored based on content objects tagged with the tags in a targeted subset of content objects and a baseline subset of content objects, and based on the relevance of the content objects tagged with the tags. These tag scores can be weighted, and one or more tags can be selected for recommendation based on the weighted tag scores. Tag clusters can be selected for recommendation using a cluster hierarchy and determining whether a targeted subset of tags occur within a maximum number of tag clusters at a particular hierarchy level.
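The cluster selection step mentioned at the end of the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the hierarchy is stored or traversed, so everything here is an assumption made for illustration: the hierarchy is a list of hypothetical tag-to-cluster mappings ordered from finest to coarsest, the function name `pick_cluster_level` is invented, and the walk stops at the first level where the targeted tags fall within at most `max_clusters` clusters.

```python
def pick_cluster_level(levels, targeted_tags, max_clusters):
    """Find a hierarchy level whose clusters cover the targeted tags compactly.

    levels        -- list of dicts mapping tag -> cluster id, ordered from the
                     finest level to the coarsest (an assumed representation)
    targeted_tags -- iterable of tags drawn from the user context
    max_clusters  -- maximum number of distinct clusters allowed at a level

    Returns (level_index, cluster_ids) for the first qualifying level,
    or None when no level qualifies.
    """
    for index, tag_to_cluster in enumerate(levels):
        # Collect the distinct clusters that the targeted tags land in.
        clusters = {
            tag_to_cluster[tag]
            for tag in targeted_tags
            if tag in tag_to_cluster
        }
        if clusters and len(clusters) <= max_clusters:
            return index, clusters
    return None
```

Walking from the finest level upward favours the most specific clusters that still satisfy the limit. Whether that traversal order matches the patent is not stated in this record.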
Publication number: US9519685(B1)  Publication date: 2016.12.13
Application number: US201314014776  Filing date: 2013.08.30
Applicant: deviantArt, Inc.  Inventors: McCann Andrew Simz Arneson; Donaldson Roger David
Classification: G06F7/00; G06F17/30; G06F15/16  Main classification: G06F7/00
Agency: Fenwick & West LLP  Agent: Fenwick & West LLP
Main claim: 1. A computer implemented method of selecting content object tags for recommendation to a user of a content hosting service, the method comprising: identifying a baseline subset of content objects within a content object corpus at the content hosting service based on a user context at the content hosting service; identifying a targeted subset of the baseline subset of content objects based on the user context, wherein each content object in the targeted subset of content objects is associated with one or more tags; for each tag associated with the targeted subset of content objects: determining a targeted subset count score for the tag based on the number of content objects in the targeted subset of content objects tagged with the tag; determining a frequency normalization score for the tag based on the proportion of the targeted subset of content objects that are tagged with the tag relative to the proportion of the baseline subset of content objects that are tagged with the tag; determining a distribution score for the tag based on 1) a first ratio comprising a number of content objects in a top-ranked portion of the targeted subset of content objects to the total number of content objects in the targeted subset of content objects, 2) a second ratio comprising a number of content objects in the top-ranked portion of the targeted subset of content objects that are tagged with the tag to the total number of content objects in the targeted subset of content objects that are tagged with the tag, and 3) a third ratio based on a logarithm of the first ratio and a logarithm of the second ratio; and computing a weighted tag score for the tag by combining all of the targeted subset count score for the tag, the frequency normalization score for the tag, the distribution score for the tag, and one or more associated weight coefficients; and selecting one or more tags for recommendation to the user based on the determined weighted tag scores.
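The scoring steps of claim 1 can be sketched in Python as follows. The claim names the three scores but does not give exact formulas, so the concrete choices here are assumptions for illustration only: the count score is the raw count, the frequency normalization score is the quotient of the two proportions, the third ratio is taken as log(first ratio) / log(second ratio) with a cap for the degenerate cases, and the scores are combined as a weighted sum. The `ContentObject` type, the function names, `top_fraction`, and `cap` are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ContentObject:
    """Hypothetical content object: a relevance value plus its tags."""
    relevance: float
    tags: frozenset


def _distribution_score(first_ratio, second_ratio, cap=10.0):
    """Assumed form of the 'third ratio': log(first) / log(second), capped."""
    if second_ratio <= 0.0:
        return 0.0   # tag never appears in the top-ranked portion
    if first_ratio >= 1.0:
        return 1.0   # top-ranked portion is the whole subset: neutral
    if second_ratio >= 1.0:
        return cap   # every tagged object is top-ranked; log(1) == 0
    return min(cap, math.log(first_ratio) / math.log(second_ratio))


def score_tags(baseline, targeted, top_fraction=0.25, weights=(1.0, 1.0, 1.0)):
    """Return {tag: weighted score} for each tag in the targeted subset.

    Assumes `targeted` is a subset of `baseline`, as the claim requires, so
    every targeted tag occurs at least once in the baseline.
    """
    if not targeted:
        return {}
    # Top-ranked portion of the targeted subset, ordered by relevance.
    ranked = sorted(targeted, key=lambda obj: obj.relevance, reverse=True)
    top_n = max(1, int(len(ranked) * top_fraction))
    top = ranked[:top_n]
    first_ratio = top_n / len(ranked)

    w_count, w_freq, w_dist = weights
    scores = {}
    for tag in set().union(*(obj.tags for obj in targeted)):
        t_count = sum(tag in obj.tags for obj in targeted)
        b_count = sum(tag in obj.tags for obj in baseline)
        top_count = sum(tag in obj.tags for obj in top)

        count_score = float(t_count)
        freq_score = (t_count / len(targeted)) / (b_count / len(baseline))
        dist_score = _distribution_score(first_ratio, top_count / t_count)

        # Weighted sum is one possible way of "combining" the three scores.
        scores[tag] = (w_count * count_score
                       + w_freq * freq_score
                       + w_dist * dist_score)
    return scores


def select_tags(scores, k=3):
    """Pick the k highest-scoring tags for recommendation."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Under these assumptions, a distribution score above 1 means the tag is over-represented among the top-ranked objects, and a score below 1 means it is under-represented. The weights let a deployment trade off raw popularity, distinctiveness against the baseline, and concentration among relevant results.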
Address: Los Angeles, CA, US