Title of invention: Video signal processing systems and methods utilizing automated speech analysis
Abstract: A method of increasing the frame rate of an image of a speaking person comprises monitoring an audio signal indicative of utterances by the speaking person and the associated video signal. The audio signal corresponds to one or more fields or frames to be reconstructed, and individual portions of the audio signal are associated with facial feature information. The facial feature information includes mouth formation and position information derived from phonemes or other speech-based criteria from which the position of a speaker's mouth may be reliably predicted. A field or frame of the image is reconstructed using image features extracted from an existing frame and by utilizing the facial feature information associated with a detected phoneme.
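The reconstruction step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that a frame can be represented as a dictionary of features, that phonemes arrive as ARPAbet-style labels, and that a mouth shape reduces to two numbers. The names `MouthShape`, `PHONEME_TO_MOUTH`, `reconstruct_frame`, and `upsample`, and all table values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MouthShape:
    """Hypothetical two-parameter mouth description (not from the patent)."""
    openness: float  # 0.0 = closed, 1.0 = fully open (assumed scale)
    width: float     # 0.0 = rounded, 1.0 = spread (assumed scale)


NEUTRAL = MouthShape(0.0, 0.5)

# Assumed lookup table: phoneme label -> mouth shape. Values are illustrative.
PHONEME_TO_MOUTH = {
    "AA": MouthShape(0.9, 0.5),  # open vowel
    "IY": MouthShape(0.3, 0.9),  # spread lips
    "UW": MouthShape(0.3, 0.1),  # rounded lips
    "M": MouthShape(0.0, 0.5),   # lips closed
}


def mouth_for_phoneme(phoneme):
    """Return the mouth shape for a phoneme, or NEUTRAL if it is unknown."""
    return PHONEME_TO_MOUTH.get(phoneme, NEUTRAL)


def reconstruct_frame(existing_frame, phoneme):
    """Copy features from an existing frame and replace only the mouth."""
    new_frame = dict(existing_frame)  # shallow copy; the input is not mutated
    new_frame["mouth"] = mouth_for_phoneme(phoneme)
    return new_frame


def upsample(frames, phonemes_between):
    """Insert one reconstructed frame after each existing frame.

    phonemes_between[i] is the phoneme detected in the audio that follows
    frames[i]; the output has twice as many frames as the input.
    """
    out = []
    for frame, phoneme in zip(frames, phonemes_between):
        out.append(frame)
        out.append(reconstruct_frame(frame, phoneme))
    return out
```

In this sketch every feature other than the mouth is carried over unchanged from the preceding frame, which is one simple reading of "image features extracted from the existing frame". A real system would operate on pixel data and would need to detect phonemes from the audio signal, both of which are outside the scope of this illustration.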
Publication number: US6330023(B1) Publication date: 2001.12.11
Application number: US19940210529 Filing date: 1994.03.18
Applicant: AMERICAN TELEPHONE AND TELEGRAPH CORPORATION Inventor: CHEN TSUHAN
Classification: G06T9/00;G10L13/00;G10L21/06;H04N7/14;H04N7/26;H04N7/46;H04N7/52;(IPC1-7):H04N7/13 Main classification: G06T9/00
Agency: Agent:
Principal claim:
Address: