Abstract |
A method of training a text-to-speech (TTS) system (104) to assign intonational features, such as intonational phrase boundaries, to input text (110). A set of predetermined text (110) is annotated by a human with intonational features. The annotated text is passed through a preprocessor (120) and a phrasing module (122), in which a set of decision nodes is generated by statistically analyzing information derived from the structure of the predetermined text. The resulting statistical representation may then be stored and used repeatedly, without further training, to generate synthesized speech from new input text by way of a post-processor (124).
|
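
By way of illustration only, the generation of decision nodes from human-annotated text might be sketched as follows. The feature names (`punct_follows`, `next_is_function_word`), the tiny training set, and the use of Gini impurity as the splitting criterion are assumptions made for this sketch and are not taken from the disclosure. Each training row describes one word junction, and its label records whether the annotator marked an intonational phrase boundary there.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of boundary labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, features):
    """Recursively grow decision nodes from annotated word junctions."""
    majority = Counter(labels).most_common(1)[0][0]
    if gini(labels) == 0.0 or not features:
        return majority  # leaf: predict the majority annotation
    best = None
    for f in features:
        yes = [l for r, l in zip(rows, labels) if r[f]]
        no = [l for r, l in zip(rows, labels) if not r[f]]
        if not yes or not no:
            continue  # this feature does not separate the rows
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
        if best is None or score < best[0]:
            best = (score, f)
    if best is None:
        return majority
    f = best[1]
    rest = [g for g in features if g != f]
    yes_pairs = [(r, l) for r, l in zip(rows, labels) if r[f]]
    no_pairs = [(r, l) for r, l in zip(rows, labels) if not r[f]]
    return {
        "feature": f,
        "yes": build_tree([r for r, _ in yes_pairs], [l for _, l in yes_pairs], rest),
        "no": build_tree([r for r, _ in no_pairs], [l for _, l in no_pairs], rest),
    }

def predict(tree, row):
    """Walk the stored decision nodes for a junction in new input text."""
    while isinstance(tree, dict):
        tree = tree["yes"] if row[tree["feature"]] else tree["no"]
    return tree

# Hypothetical annotated junctions: (features, boundary marked by annotator).
TRAIN = [
    ({"punct_follows": True, "next_is_function_word": False}, True),
    ({"punct_follows": True, "next_is_function_word": True}, True),
    ({"punct_follows": False, "next_is_function_word": True}, False),
    ({"punct_follows": False, "next_is_function_word": False}, False),
]
rows = [r for r, _ in TRAIN]
labels = [l for _, l in TRAIN]
tree = build_tree(rows, labels, ["punct_follows", "next_is_function_word"])
```

Once built, `tree` can be stored and reused, so that `predict(tree, junction)` assigns a boundary decision to each junction of new input text without further training.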