Title: Crowd sourcing for file recognition
Abstract: Methods for identifying content in encrypted or otherwise protected files utilize crowd sourcing for content identification. One such method includes, using a computer, selecting defined content titles to be presented with identifiers for data files for use in obtaining user selection data. The method may also include receiving the user selection data from multiple independent sources, the user selection data indicating users' selections of single ones of the content titles for respective single ones of the data files. The method may also include determining for ones of the identifiers, using the one or more computers processing the user selection data, respective ones of the content titles satisfying a minimum confidence threshold for association with the ones of the identifiers. An apparatus for performing the method comprises a processor coupled to a memory, the memory holding instructions for performing steps of the method as summarized above.
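The determining step summarized in the abstract can be illustrated with a minimal Python sketch. It assumes that confidence is measured as the leading title's share of all selections for an identifier, combined with a minimum number of selections; the record does not specify the metric, so the function name `resolve_titles`, the parameters `min_share` and `min_votes`, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def resolve_titles(selections, min_share=0.6, min_votes=3):
    """Map each file identifier to a title once the votes are confident enough.

    selections: iterable of (identifier, title) pairs, one per user selection.
    min_share, min_votes: hypothetical thresholds, not taken from the patent.
    """
    # Tally how many independent users picked each title for each identifier.
    votes = defaultdict(Counter)
    for identifier, title in selections:
        votes[identifier][title] += 1

    resolved = {}
    for identifier, counts in votes.items():
        title, top = counts.most_common(1)[0]
        total = sum(counts.values())
        # Accept the leading title only if enough users voted and enough agreed.
        if total >= min_votes and top / total >= min_share:
            resolved[identifier] = title
    return resolved


# Made-up selections for two hypothetical identifiers.
sample = [
    ("id-a", "Title One"), ("id-a", "Title One"), ("id-a", "Title Two"),
    ("id-a", "Title One"), ("id-b", "Title Two"), ("id-b", "Title Three"),
]
print(resolve_titles(sample))
```

In this sketch `id-a` is resolved because three of its four selections agree, while `id-b` stays unresolved because it has too few selections to meet the assumed minimum.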
Publication number: US9626456(B2); Publication date: 2017.04.18
Application number: US201012901321; Filing date: 2010.10.08
Applicant: WARNER BROS. ENTERTAINMENT INC.; Inventor: Kozan Kevin Michael
Classification: G06F17/30; Main classification: G06F17/30
Agency: Snell & Wilmer L.L.P.; Agent: Jaech Jonathan; Snell & Wilmer L.L.P.
Principal claim: 1. A method for identifying encrypted media content contained within each file of a library of encrypted data files without decrypting the data files, comprising: selecting, by one or more computers, content titles from a database of content titles to be presented with identifiers for the data files, including selecting multiple different ones of the content titles for singular ones of the identifiers, wherein each of the identifiers comprises a hash of a set of unique file metadata, and each of the content titles is a character string used to uniquely identify the encrypted media content contained within the each file of the library; sending the multiple different ones of the content titles with the singular ones of the identifiers to different clients each operated by an independent user for presentation by the different clients with a request that a user identify a correct one of the multiple different ones of the content titles for corresponding ones of the data files; receiving user selection data from the multiple independent sources in response to the sending, the user selection data indicating users' selections of a user-selected correct one of the content titles for each respective one of the data files responsive to presentations of the multiple different ones of the content titles with the singular ones of the identifiers by the different clients; determining for each one of the identifiers, using the one or more computers processing the user selection data, a respective one of the content titles satisfying a minimum confidence threshold for association with the each one of the identifiers, based on at least one of a quality or quantity of the multiple independent sources supplying the user selection data for each of the content titles; and providing the respective one of the content titles satisfying the minimum confidence threshold for recording as associated with the each one of the identifiers in a data structure, the data structure is used for querying using one of the identifiers to provide an associated one of the content titles for use in identifying a data file and for use in providing one of the content titles for the data file in response to determining that the one of the content titles satisfies the minimum confidence threshold and is associated with the one of the identifiers for the data file.
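Three elements of the claim lend themselves to a short Python illustration: an identifier formed as a hash of file metadata, a confidence test based on the quality or quantity of sources, and a data structure queried by identifier. The sketch below is one possible reading under stated assumptions, not the patented implementation. SHA-256 over canonical JSON, the per-source `source_quality` weight, the thresholds `min_confidence` and `min_weight`, and the names `make_identifier` and `TitleIndex` are all hypothetical.

```python
import hashlib
import json
from collections import defaultdict


def make_identifier(metadata):
    """Hash file metadata (never the encrypted payload) into a stable identifier.

    SHA-256 over sorted-key JSON is an assumption; the claim only requires
    a hash of a set of unique file metadata.
    """
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TitleIndex:
    """Records weighted user selections and answers identifier queries."""

    def __init__(self, min_confidence=0.7, min_weight=2.0):
        self.min_confidence = min_confidence  # hypothetical share threshold
        self.min_weight = min_weight          # hypothetical minimum total weight
        self._weights = defaultdict(lambda: defaultdict(float))
        self._confirmed = {}

    def record_selection(self, identifier, title, source_quality=1.0):
        # Each selection adds the source's quality weight to the chosen title.
        self._weights[identifier][title] += source_quality
        self._update(identifier)

    def _update(self, identifier):
        tallies = self._weights[identifier]
        total = sum(tallies.values())
        title, weight = max(tallies.items(), key=lambda kv: kv[1])
        if total >= self.min_weight and weight / total >= self.min_confidence:
            self._confirmed[identifier] = title
        else:
            # Drop a previous association if later selections erode confidence.
            self._confirmed.pop(identifier, None)

    def lookup(self, identifier):
        """Return the confirmed title for an identifier, or None."""
        return self._confirmed.get(identifier)


# Made-up metadata and selections for illustration only.
file_id = make_identifier({"name": "a1b2.bin", "size": 734003200, "mtime": 1286496000})
index = TitleIndex()
index.record_selection(file_id, "Title One", source_quality=1.0)
first = index.lookup(file_id)   # still unconfirmed: total weight is below min_weight
index.record_selection(file_id, "Title One", source_quality=1.5)
index.record_selection(file_id, "Title Two", source_quality=0.5)
second = index.lookup(file_id)  # confirmed once weight and share thresholds are met
print(first, second)
```

Weighting each selection by a source-quality score covers the "quality" branch of the claim, and the minimum total weight covers the "quantity" branch when every source has weight 1.0. A real system would also need to decide how source quality is estimated, which this record does not describe.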
Address: Burbank CA US