Title: System and method for digital video retrieval involving speech recognition
Abstract: Disclosed are systems, methods, and computer readable media for retrieving digital images. The method embodiment includes converting a descriptive audio stream of a digital video that is provided for the visually impaired to text and then aligning that text to the appropriate segment of the digital video. The system then indexes the converted text from the descriptive audio stream with the text's relationship to the digital video. The system enables queries using action words describing a desired scene from a digital video.
Publication number: US9465870(B2)   Publication date: 2016.10.11
Application number: US201514850100   Filing date: 2015.09.10
Applicant: AT&T Intellectual Property I, L.P.   Inventor: Bangalore Srinivas
Classification: H04L29/06;G06F21/00;G06F17/30;G10L15/26;H04N21/439;H04N21/4402;H04N21/482;H04N21/8547;G09B21/00   Main classification: H04L29/06
Agency:   Agent:
Main claim: 1. A method comprising: receiving a query for a portion of a video; identifying, via a processor, the portion of the video using an index, wherein the index is generated by aligning text from a digital audio stream associated with the video to frames in the video based on a non-textual optical analysis of the video, and wherein the aligning of the text utilizes a first bit rate associated with the frames in the video and a second bit rate associated with the digital audio stream; and presenting the portion of the video in response to the query.
Address: Atlanta GA US
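
The retrieval flow summarized in the abstract (text converted from the descriptive audio stream, aligned to video segments, indexed, then queried with action words) can be illustrated with a minimal sketch. This is not the patented implementation. It assumes the speech-to-text and alignment steps have already produced timestamped text, and it does not model the optical analysis or bit-rate-based alignment recited in the claim. The names `Segment`, `build_index`, and `search`, and the sample descriptions, are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A span of video with its already-transcribed description text (assumed input)."""
    start_s: float
    end_s: float
    text: str


def tokenize(text):
    # Lowercase and split into simple word tokens; no stemming is applied.
    return re.findall(r"[a-z']+", text.lower())


def build_index(segments):
    # Inverted index: word -> set of positions in `segments` whose text contains it.
    index = defaultdict(set)
    for i, seg in enumerate(segments):
        for word in tokenize(seg.text):
            index[word].add(i)
    return index


def search(index, segments, query):
    # Return the segments whose description contains every query word.
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return [segments[i] for i in sorted(hits)]


# Hypothetical sample data standing in for aligned descriptive-audio text.
segments = [
    Segment(0.0, 4.5, "A man runs across the bridge"),
    Segment(4.5, 9.0, "She opens the door and runs outside"),
    Segment(9.0, 12.0, "The car explodes"),
]
index = build_index(segments)

for seg in search(index, segments, "opens door"):
    print(f"{seg.start_s:.1f}s to {seg.end_s:.1f}s: {seg.text}")
```

Because the sketch matches exact tokens, a query for "run" would not find "runs"; a practical system would likely add stemming or similar normalization.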