Abstract |
PROBLEM TO BE SOLVED: To provide a device that synchronizes an encoded facial animation with the speech generated by a decoder. SOLUTION: A facial animation is driven by two data streams, a text stream and a facial animation parameter stream. The input text is sent to a text-to-speech converter 5 in the decoder, which drives the mouth shapes of the face, and the facial animation parameters are sent from the encoder to the face over a communication channel. The text stream sent to the text-to-speech converter contains codes called bookmarks, placed between or within words, each of which carries an encoder timestamp. The facial animation parameter stream carries the same encoder timestamps. The system reads the bookmarks and supplies each encoder timestamp, together with a real-time timestamp, to the facial animation system. |
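
The bookmark mechanism described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the source: the `<ets=N>` bookmark syntax, the function names, and the constant speaking rate that stands in for the real-time clock of an actual text-to-speech converter. The sketch strips the bookmarks from the text, records the real-time instant at which the synthesizer would reach each one, and uses the shared encoder timestamps to place the facial animation parameters on the real-time axis.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical bookmark syntax, for illustration only: "<ets=NUMBER>".
BOOKMARK = re.compile(r"<ets=(\d+)>")


def split_bookmarks(text: str) -> Tuple[str, List[Tuple[int, int]]]:
    """Return the plain text and a list of (char_offset, encoder_timestamp)."""
    plain_parts: List[str] = []
    marks: List[Tuple[int, int]] = []
    pos = 0      # read position in the bookmarked text
    offset = 0   # length of the plain text emitted so far
    for match in BOOKMARK.finditer(text):
        chunk = text[pos:match.start()]
        plain_parts.append(chunk)
        offset += len(chunk)
        marks.append((offset, int(match.group(1))))
        pos = match.end()
    plain_parts.append(text[pos:])
    return "".join(plain_parts), marks


def bookmark_times(marks: List[Tuple[int, int]],
                   seconds_per_char: float) -> Dict[int, float]:
    """Map each encoder timestamp to a real-time timestamp.

    Assumption: a constant speaking rate replaces the clock of a real
    text-to-speech converter, which would report these instants itself.
    """
    return {ets: offset * seconds_per_char for offset, ets in marks}


def schedule_faps(faps: List[Tuple[int, str]],
                  ets_to_rt: Dict[int, float]) -> List[Tuple[float, str]]:
    """Attach a real-time start to each parameter entry with a known timestamp."""
    return [(ets_to_rt[ets], params) for ets, params in faps if ets in ets_to_rt]


if __name__ == "__main__":
    plain, marks = split_bookmarks("Hello <ets=1>world, <ets=2>smile.")
    timeline = bookmark_times(marks, seconds_per_char=0.5)
    print(plain)
    print(schedule_faps([(1, "open_jaw"), (2, "smile")], timeline))
```

In this sketch the encoder timestamp acts only as a matching key between the two streams. The actual playback time comes from the speech side, which is why the facial animation follows the synthesizer rather than the other way round.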