Title of invention PHOTO-REALISTIC SYNTHESIS OF IMAGE SEQUENCES WITH LIP MOVEMENTS SYNCHRONIZED WITH SPEECH
Abstract Audiovisual data of an individual reading a known script is obtained and stored in an audio library and an image library. The audiovisual data is processed to extract feature vectors used to train a statistical model. An input audio feature vector corresponding to desired speech with which a synthesized image sequence will be synchronized is provided. The statistical model is used to generate a trajectory of visual feature vectors that corresponds to the input audio feature vector. These visual feature vectors are used to identify a matching image sequence from the image library. The resulting sequence of images, concatenated from the image library, provides a photorealistic image sequence with lip movements synchronized with the desired speech.
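The image-selection step described in the abstract can be illustrated with a minimal sketch. It assumes the visual feature trajectory has already been generated by the statistical model, and it swaps in a plain per-frame nearest-neighbor search under Euclidean distance as the matching criterion; the abstract does not specify how the matching is done. The names `nearest_frame`, `select_image_sequence`, `library_features`, and `trajectory` are hypothetical, and the two-dimensional feature values are toy data.

```python
import math


def nearest_frame(target, library):
    """Return the index of the library feature vector closest to target.

    Assumption: Euclidean distance stands in for the matching criterion.
    """
    return min(range(len(library)), key=lambda i: math.dist(target, library[i]))


def select_image_sequence(trajectory, library):
    """Pick one library frame index per vector in the visual trajectory."""
    return [nearest_frame(vector, library) for vector in trajectory]


# Toy image library: one visual feature vector per stored image (hypothetical values).
library_features = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]

# Toy trajectory, standing in for the output of the trained statistical model.
trajectory = [(0.1, -0.1), (0.9, 0.2), (0.8, 0.9)]

# Indices of the library images to concatenate into the output sequence.
frame_indices = select_image_sequence(trajectory, library_features)
print(frame_indices)
```

Independent per-frame selection ignores how smoothly consecutive images join; a fuller approach would likely also penalize visible discontinuities between neighboring frames.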
Publication number US2012284029(A1) Publication date 2012.11.08
Application number US201113098488 Filing date 2011.05.02
Applicant WANG LIJUAN;SOONG FRANK;MICROSOFT CORPORATION Inventor WANG LIJUAN;SOONG FRANK
Classification G10L21/00 Main classification G10L21/00
Agency Agent
Main claim
Address