Abstract |
A speech synthesizer has a language generator for generating a text-form utterance from input semantic information and a text-to-speech converter for converting the text-form utterance into speech form. The overall quality of the speech-form utterance produced by the text-to-speech converter is assessed and, if judged inadequate, the language generator is triggered to produce a new version of the text-form utterance. The assessment of the overall quality of the speech-form utterance is preferably effected by a classifier fed with feature values generated during the conversion process operated by the text-to-speech converter. |
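
The feedback loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `synthesize_with_feedback`, `toy_generate`, `toy_tts`, and `toy_is_adequate` are hypothetical, the `word_count` feature and its threshold are placeholders for whatever feature values a real converter would emit, and the `max_attempts` limit is an added assumption (the abstract does not state a retry bound).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Semantics = Dict[str, str]
Features = Dict[str, float]


@dataclass
class SynthesisResult:
    text: str        # text-form utterance that was finally used
    speech: bytes    # speech-form utterance (placeholder payload here)
    accepted: bool   # True if the classifier judged the quality adequate
    attempts: int    # number of text versions that were tried


def synthesize_with_feedback(
    semantics: Semantics,
    generate_text: Callable[[Semantics, int], str],
    text_to_speech: Callable[[str], Tuple[bytes, Features]],
    is_adequate: Callable[[Features], bool],
    max_attempts: int = 3,  # assumption: retry bound not given in the abstract
) -> SynthesisResult:
    """Generate text, convert it to speech, and regenerate if quality is poor."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        # Language generator: produce version (attempt - 1) of the utterance.
        text = generate_text(semantics, attempt - 1)
        # Converter: returns speech plus feature values from the conversion.
        speech, features = text_to_speech(text)
        # Classifier: assess overall quality from those feature values.
        if is_adequate(features):
            return SynthesisResult(text, speech, True, attempt)
    # No version was judged adequate; fall back to the last one produced.
    return SynthesisResult(text, speech, False, max_attempts)


# Hypothetical stand-ins so the sketch runs end to end.
def toy_generate(semantics: Semantics, version: int) -> str:
    templates = ["The {item} costs {price}.", "{item}: price {price}."]
    return templates[version % len(templates)].format(**semantics)


def toy_tts(text: str) -> Tuple[bytes, Features]:
    # Placeholder "speech" and a single placeholder feature value.
    return text.encode("utf-8"), {"word_count": float(len(text.split()))}


def toy_is_adequate(features: Features) -> bool:
    # Placeholder decision rule standing in for a trained classifier.
    return features["word_count"] <= 4


if __name__ == "__main__":
    result = synthesize_with_feedback(
        {"item": "ticket", "price": "10 euros"},
        toy_generate, toy_tts, toy_is_adequate,
    )
    print(result)
```

Passing the generator, converter, and classifier in as callables keeps the loop independent of any particular component, which mirrors the separation between the three parts named in the abstract.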