Abstract |
The disclosure describes methods for synthesizing accurate visible speech using transformations of motion-capture data. Methods are provided for synthesis of visible speech in a three-dimensional face. A sequence of visemes, each associated with one or more phonemes, is mapped onto a three-dimensional target face and concatenated. The sequence may include divisemes corresponding to pairwise sequences of phonemes, wherein each diviseme comprises motion trajectories of a set of facial points. The sequence may also include multi-units corresponding to words and sequences of words. Various techniques involving mapping and concatenation are also addressed.
|