Abstract |
A method of high-speed reading in a text-to-speech conversion system that includes a text analysis module (101) for generating a phoneme and prosody character string from an input text; a prosody generation module (102) for generating, for the phoneme and prosody character string, synthesis parameters comprising at least a voice segment, a phoneme duration, and a fundamental frequency; and a speech generation module (103) for generating a synthetic waveform by waveform superimposition with reference to a voice segment dictionary (105). The prosody generation module is provided with both a duration rule table containing empirically found phoneme durations and a duration prediction table containing phoneme durations predicted by statistical analysis. When the user-designated utterance speed exceeds a threshold, the module uses the duration rule table to determine the phoneme duration; when the threshold is not exceeded, it uses the duration prediction table.
|
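The table selection described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the table contents, the threshold value, the speed unit, and the names `DURATION_RULE_TABLE`, `DURATION_PREDICTION_TABLE`, `SPEED_THRESHOLD`, `select_duration_table`, and `phoneme_duration` are all hypothetical and chosen only for illustration. The sketch assumes each table maps a phoneme directly to a duration in milliseconds.

```python
from typing import Dict

# Hypothetical tables (phoneme -> duration in ms); the values are invented.
# Rule table: durations found empirically.
DURATION_RULE_TABLE: Dict[str, float] = {"a": 80.0, "k": 50.0}
# Prediction table: durations predicted by statistical analysis.
DURATION_PREDICTION_TABLE: Dict[str, float] = {"a": 95.0, "k": 62.0}

# Assumed threshold on the user-designated utterance speed (unit is an assumption,
# e.g. a multiple of the normal reading speed).
SPEED_THRESHOLD = 2.0


def select_duration_table(utterance_speed: float,
                          threshold: float = SPEED_THRESHOLD) -> Dict[str, float]:
    """Pick the duration table as the abstract describes.

    Speed above the threshold -> rule table; otherwise -> prediction table.
    """
    if utterance_speed > threshold:
        return DURATION_RULE_TABLE
    return DURATION_PREDICTION_TABLE


def phoneme_duration(phoneme: str,
                     utterance_speed: float,
                     threshold: float = SPEED_THRESHOLD) -> float:
    """Look up one phoneme's duration in the table chosen for this speed."""
    table = select_duration_table(utterance_speed, threshold)
    return table[phoneme]


if __name__ == "__main__":
    for speed in (1.0, 2.0, 3.0):
        print(speed, phoneme_duration("a", speed))
```

A speed exactly equal to the threshold does not exceed it, so the sketch falls through to the prediction table in that case. How the chosen duration is then scaled to the requested speed is not stated in the abstract and is left out here.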