Title Speech synthesis using perceptual linear prediction parameters
Abstract A method for synthesizing human speech using a linear mapping of a small set of coefficients that are speaker-independent. Preferably, the speaker-independent coefficients are cepstral coefficients developed during a training session using a perceptual linear predictive (PLP) analysis. In the same training session, a linear predictive all-pole model is used to develop the corresponding formants and bandwidths, to which the cepstral coefficients are mapped by a separate multiple regression model for each of the five formant frequencies and five formant bandwidths. This dual analysis produces both the cepstral coefficients of the PLP model for the different vowel-like sounds and their true formant frequencies and bandwidths. The separate multiple regression models, developed by mapping the cepstral coefficients into the formant frequencies and formant bandwidths, can then be applied to cepstral coefficients determined for subsequent speech to produce the corresponding formants and bandwidths used to synthesize that speech. Since less data is required to synthesize each speech segment than in conventional techniques, the storage space and/or transmission rate required for the synthesis data is reduced. In addition, the cepstral coefficients for each speech segment can be used with the regression model for a different speaker to produce synthesized speech corresponding to that speaker.
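The mapping described in the abstract can be illustrated with a minimal sketch, assuming ordinary least-squares regression with an intercept term. The function names, the choice of five cepstral coefficients per frame, and the synthetic training data are all hypothetical and serve only to show the shape of the computation; they are not taken from the patent.

```python
import numpy as np

def add_intercept(cepstra):
    """Prepend a column of ones so each regression has an intercept term."""
    return np.hstack([np.ones((cepstra.shape[0], 1)), cepstra])

def fit_regressions(cepstra, targets):
    """Fit one multiple linear regression per target column.

    cepstra: (n_frames, n_cep) cepstral coefficients from training speech.
    targets: (n_frames, 10) measured values, assumed here to be five formant
             frequencies followed by five formant bandwidths.
    Solving least squares with a 2-D right-hand side fits each column
    independently, i.e. a separate regression model per formant parameter.
    """
    weights, *_ = np.linalg.lstsq(add_intercept(cepstra), targets, rcond=None)
    return weights  # shape (n_cep + 1, 10)

def predict_formants(weights, cepstra):
    """Map cepstral coefficients of new speech to formant parameters."""
    return add_intercept(cepstra) @ weights

# Synthetic, noise-free training data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n_frames, n_cep, n_targets = 200, 5, 10
train_cepstra = rng.normal(size=(n_frames, n_cep))
true_weights = rng.normal(size=(n_cep + 1, n_targets))
train_targets = add_intercept(train_cepstra) @ true_weights

weights = fit_regressions(train_cepstra, train_targets)

# Apply the fitted models to cepstra from "subsequent speech".
new_cepstra = rng.normal(size=(3, n_cep))
predicted = predict_formants(weights, new_cepstra)
print(predicted.shape)
```

Under these assumptions, only the cepstral coefficients need to be stored or transmitted per frame. Fitting a second weight matrix on another speaker's training data and applying it to the same cepstra would correspond to the speaker substitution mentioned in the abstract.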
Publication number US5165008(A) Publication date 1992.11.17
Application number US19910761190 Filing date 1991.09.18
Applicant U S WEST ADVANCED TECHNOLOGIES, INC. Inventors HERMANSKY, HYNEK;COX, JR., LOUIS A.
Classification G10L19/06 Main classification G10L19/06
Agency Agent
Main claim
Address