A telephone system is described in which each subscriber telephone stores an appearance model of a party to a telephone call and uses that model to synthesise a video sequence of the party from a set of appearance parameters received from the telephone network. The appearance parameters may be generated either from a camera associated with that party's telephone or from text or speech signals input by that party.
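
The receiving side of this arrangement can be illustrated with a minimal sketch. It assumes a linear appearance model (a mean appearance plus a weighted sum of modes of variation), which the description above does not specify; the names `AppearanceModel`, `synthesise_frame` and `synthesise_sequence` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AppearanceModel:
    """Hypothetical stored model of one party's appearance (assumed linear)."""
    mean: List[float]         # mean appearance as a flattened pixel vector
    modes: List[List[float]]  # one mode of variation per appearance parameter

    def synthesise_frame(self, params: List[float]) -> List[float]:
        """Reconstruct one frame as mean + sum(params[k] * modes[k])."""
        if len(params) != len(self.modes):
            raise ValueError("expected one parameter per mode")
        frame = list(self.mean)
        for weight, mode in zip(params, self.modes):
            for i, value in enumerate(mode):
                frame[i] += weight * value
        return frame


def synthesise_sequence(model: AppearanceModel,
                        parameter_stream: List[List[float]]) -> List[List[float]]:
    """Turn the parameter sets received from the network into video frames."""
    return [model.synthesise_frame(params) for params in parameter_stream]
```

Under this assumption only the short parameter vectors need to cross the network for each frame, while the model held in the telephone supplies the pixel detail.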