Abstract |
<p>PROBLEM TO BE SOLVED: To generate synthesized speech that smoothly expresses a range of feeling variations from a small amount of training data.</p><p>SOLUTION: A feeling information extraction part 6 analyzes the meaning of an input text and extracts the intensity of each feeling. A feeling control information conversion part 7 converts feeling control information, that is, the change in feeling intensity over time, into parameter conversion information. A feeling input interface part 9 converts directly input feeling control information into parameter conversion information. A feeling control part 8 converts the parameter conversion information into reference parameters according to a conversion rule. A rhythm control part 3 converts the rhythm pattern derived from the input text into a feeling rhythm pattern according to the reference parameters. A parameter control part 4 converts the parameters derived from the input text into feeling parameters according to the reference parameters. A speech synthesis part 5 concatenates phoneme pieces according to the converted rhythm pattern and the converted parameters. As a result, synthesized speech is generated that smoothly expresses a continuously varying feeling.</p><p>COPYRIGHT: (C)2003,JPO</p> |
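The flow in the abstract, from time-varying feeling intensity through a conversion rule to a modified rhythm pattern, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and is not taken from the patent: the `RhythmPoint` type, the `RULES` table and its gain values, linear interpolation between intensity keyframes, and the multiplicative scaling of pitch and duration. The abstract does not specify the actual conversion rule or parameter set.

```python
from dataclasses import dataclass


@dataclass
class RhythmPoint:
    """One point of a neutral rhythm pattern (hypothetical representation)."""
    time: float                   # seconds from the start of the utterance
    pitch_hz: float               # neutral pitch target
    duration_scale: float = 1.0   # 1.0 means neutral segment duration


# Hypothetical conversion rule: relative (pitch gain, duration gain)
# applied at full intensity (1.0). The values are illustrative only.
RULES = {
    "joy": (0.20, -0.10),
    "sadness": (-0.10, 0.15),
}


def intensity_at(control, t):
    """Linearly interpolate feeling intensity at time t.

    `control` is a list of (time, intensity) keyframes with strictly
    increasing times; values outside the range are clamped to the ends.
    """
    if t <= control[0][0]:
        return control[0][1]
    for (t0, v0), (t1, v1) in zip(control, control[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return control[-1][1]


def apply_feeling(pattern, feeling, control):
    """Return a new rhythm pattern scaled by the interpolated intensity."""
    pitch_gain, duration_gain = RULES[feeling]
    converted = []
    for point in pattern:
        a = intensity_at(control, point.time)
        converted.append(
            RhythmPoint(
                time=point.time,
                pitch_hz=point.pitch_hz * (1.0 + pitch_gain * a),
                duration_scale=point.duration_scale * (1.0 + duration_gain * a),
            )
        )
    return converted


if __name__ == "__main__":
    neutral = [RhythmPoint(0.0, 100.0), RhythmPoint(0.5, 100.0), RhythmPoint(1.0, 100.0)]
    # Intensity of "joy" rises from 0 to 1 over one second.
    for p in apply_feeling(neutral, "joy", [(0.0, 0.0), (1.0, 1.0)]):
        print(p)
```

Because the intensity is interpolated rather than switched between discrete feeling labels, the converted pitch and duration change gradually along the utterance, which is one plausible way to obtain the smooth, continuously varying expression the abstract describes.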