Title of Invention SEMANTIC MULTISENSORY EMBEDDINGS FOR VIDEO SEARCH BY TEXT
Abstract A method of embedding video for text search includes extracting visual features from a video. The visual features may include, for example, appearance, motion, audio, and similar features. Term vectors are determined from textual descriptions associated with the video; the text may come from a title of the video or from text within the video (e.g., subtitles). A feature projection is computed based on the extracted video features, and a textual projection is computed based on the term vectors. A semantic embedding is computed based on the feature projection and the textual projection by jointly optimizing semantic predictability and semantic descriptiveness.
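The pipeline in the abstract can be illustrated with a short sketch. The code below is illustrative only and not the claimed implementation: it assumes random stand-in data, linear projections, and a simple two-term least-squares loss minimized by plain gradient descent; every name, dimension, and hyperparameter is a hypothetical choice for exposition.

    # Illustrative sketch only: a minimal joint embedding of video features and
    # term vectors, loosely following the abstract. Names, dimensions, and the
    # exact loss are assumptions, not the claimed implementation.
    import numpy as np

    rng = np.random.default_rng(0)

    # Stand-ins for the extracted inputs:
    # X: per-video multisensory features (appearance, motion, audio), here random.
    # Y: term vectors from titles/subtitles, here random binary indicators.
    n_videos, feat_dim, vocab_size, embed_dim = 100, 64, 200, 16
    X = rng.normal(size=(n_videos, feat_dim))
    Y = (rng.random((n_videos, vocab_size)) < 0.05).astype(float)

    # Learnable parameters:
    # W: feature projection (video features -> embedding)
    # A: textual projection (embedding -> term vectors)
    # S: per-video semantic embedding
    W = rng.normal(scale=0.1, size=(feat_dim, embed_dim))
    A = rng.normal(scale=0.1, size=(embed_dim, vocab_size))
    S = rng.normal(scale=0.1, size=(n_videos, embed_dim))

    lr = 1e-3
    for step in range(500):
        # Semantic descriptiveness: the embedding should reconstruct the term vectors.
        desc_residual = S @ A - Y
        # Semantic predictability: the embedding should be predictable from video features.
        pred_residual = X @ W - S
        # Joint gradient steps on both terms (constant factors folded into lr).
        grad_S = desc_residual @ A.T - pred_residual
        grad_A = S.T @ desc_residual
        grad_W = X.T @ pred_residual
        S -= lr * grad_S
        A -= lr * grad_A
        W -= lr * grad_W

    # One possible retrieval step (an assumption, not stated in the abstract):
    # map a query's term vector into the embedding space and match it against
    # X @ W, the embeddings predicted from the video features alone.
    loss = np.sum((S @ A - Y) ** 2) + np.sum((X @ W - S) ** 2)
    print(f"final joint loss: {loss:.2f}")

In this reading, semantic descriptiveness corresponds to how well the embedding reconstructs the term vectors, and semantic predictability to how well the embedding can be predicted from the multisensory video features, with both terms minimized jointly.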
Publication No. US2017083623(A1)  Publication Date 2017.03.23
Application No. US201615080501  Filing Date 2016.03.24
Applicant QUALCOMM Incorporated  Inventors HABIBIAN Amirhossein; MENSINK Thomas Edgar Josef; SNOEK Cornelis Gerardus Maria
Classification G06F17/30; G06N99/00  Main Classification G06F17/30
Agency (not listed)  Agent (not listed)
Main Claim 1. A method of embedding a video for a text search, comprising: jointly optimizing a semantic predictability and a semantic descriptiveness by: learning the embedding based at least in part on terms included in a query; and learning the embedding based at least in part on a multimodal analysis of the video.
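One plausible reading of the claim's joint optimization, written with the same illustrative notation as the sketch above (Y for term vectors built from query and description terms, X for the multimodal video features, S for the embedding, A and W for the textual and feature projections); the actual formulation in the application may differ:

    \min_{S,\,A,\,W}\; \underbrace{\lVert Y - S A \rVert_F^2}_{\text{semantic descriptiveness}} \;+\; \underbrace{\lVert S - X W \rVert_F^2}_{\text{semantic predictability}}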
Address San Diego CA US