Synthesis of a moving picture of a face (e.g. to accompany synthetic speech) is performed by converting an input phoneme string into a sequence of mouth shapes, or visemes. Specifically, a mouth shape is generated for each vowel and for each transition involving a consonant.
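The conversion rule can be illustrated with a minimal sketch. It rests on several assumptions that are not in the source: the vowel inventory `VOWELS`, the function name `mouth_shape_sequence`, and the tuple labels `"vowel"` and `"transition"` are all hypothetical, and a transition is taken to mean a pair of adjacent phonemes of which at least one is a consonant. The sketch only decides which shapes are emitted and in what order; it does not model the shapes themselves.

```python
# Hypothetical vowel inventory; the source does not specify a phoneme set.
VOWELS = {"a", "e", "i", "o", "u"}


def is_vowel(phoneme):
    """Return True if the phoneme is in the assumed vowel inventory."""
    return phoneme in VOWELS


def mouth_shape_sequence(phonemes):
    """Map a phoneme list to an ordered list of mouth-shape descriptors.

    One ("vowel", v) entry is emitted per vowel, and one
    ("transition", a, b) entry per adjacent pair (a, b) in which
    at least one phoneme is a consonant. Vowel-to-vowel pairs get
    no transition entry of their own.
    """
    shapes = []
    for i, cur in enumerate(phonemes):
        if is_vowel(cur):
            shapes.append(("vowel", cur))
        if i + 1 < len(phonemes):
            nxt = phonemes[i + 1]
            # A transition "involves a consonant" unless both sides are vowels.
            if not (is_vowel(cur) and is_vowel(nxt)):
                shapes.append(("transition", cur, nxt))
    return shapes


if __name__ == "__main__":
    for shape in mouth_shape_sequence(["h", "e", "l", "o"]):
        print(shape)
```

Under these assumptions, the input `["h", "e", "l", "o"]` yields five shapes: the transition h-e, the vowel e, the transitions e-l and l-o, and the vowel o. Handling of silence and of timing between shapes is left out, since the text above does not describe it.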