Title System and method for automatic detection of abnormal stress patterns in unit selection synthesis
Abstract Disclosed herein are systems, methods, and non-transitory computer-readable storage media for detecting and correcting abnormal stress patterns in unit-selection speech synthesis. A system practicing the method detects incorrect stress patterns in selected acoustic units representing speech to be synthesized, and corrects the incorrect stress patterns in the selected acoustic units to yield corrected stress patterns. The system can further synthesize speech based on the corrected stress patterns. In one aspect, the system also classifies the incorrect stress patterns using a machine learning algorithm such as a classification and regression tree, adaptive boosting, a support vector machine, or maximum entropy. In this way a text-to-speech unit selection speech synthesizer can produce more natural-sounding speech with suitable stress patterns regardless of the stress of units in a unit selection database.
Publication number US9269348(B2) Publication date 2016.02.23
Application number US201514628790 Filing date 2015.02.23
Applicant AT&T Intellectual Property I, L.P. Inventors Kim Yeon-Jun;Beutnagel Mark Charles;Conkie Alistair D.;Syrdal Ann K.
Classification G10L13/00;G10L13/08;G10L13/10;G10L15/18;G10L25/00 Main classification G10L13/00
Agency Agent
Principal claim 1. A method comprising: receiving a stress pattern for both a language and an accent in the language; detecting, according to the stress pattern, incorrect stress patterns in selected acoustic units representing speech to be synthesized, wherein the selected acoustic units were selected by a separate unit-selection speech synthesizer; performing an analysis of the incorrect stress patterns, wherein the analysis comprises a word level analysis, a phrase level analysis, and a sentence level analysis on the incorrect stress patterns, wherein the word level analysis, the phrase level analysis, and the sentence level analysis are performed in series; and modifying, via a processor and prior to waveform synthesis, the incorrect stress patterns in the selected acoustic units according to the analysis, to yield corrected stress patterns, wherein the corrected stress patterns conform to the stress pattern for the language.
Address Atlanta GA US
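
Illustrative sketch (not part of the patent record). The steps recited in claim 1 (receive a reference stress pattern, detect mismatching units, analyze the mismatches at word, phrase, and sentence level in series, then modify the units before waveform synthesis) might be arranged roughly as follows. Everything named here is an assumption made for illustration: the `Unit` fields, the `LEXICON` and `FUNCTION_WORDS` tables, the toy phrase-level rule, and the pass-through sentence-level step are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Unit:
    """One selected acoustic unit, reduced to the fields this sketch needs."""
    word: str      # word the unit belongs to
    syllable: int  # syllable index inside that word
    stress: int    # stress carried by the database unit (0 = none, 1 = primary)


# Hypothetical reference stress pattern for one language and accent.
LEXICON: Dict[str, Tuple[int, ...]] = {"can": (1,), "record": (0, 1), "data": (1, 0)}
# Hypothetical set of words that this sketch de-stresses inside a phrase.
FUNCTION_WORDS = {"can"}

Plan = Dict[int, int]  # unit index -> target stress


def detect(units: List[Unit]) -> List[int]:
    """Return indices of units whose stress disagrees with the reference pattern."""
    bad = []
    for i, u in enumerate(units):
        pattern = LEXICON.get(u.word)
        if pattern is not None and u.syllable < len(pattern) and u.stress != pattern[u.syllable]:
            bad.append(i)
    return bad


def word_level(units: List[Unit], plan: Plan) -> Plan:
    """Word level: take the target stress from the reference pattern."""
    return {i: LEXICON[units[i].word][units[i].syllable] for i in plan}


def phrase_level(units: List[Unit], plan: Plan) -> Plan:
    """Phrase level (toy rule, assumed): function words end up unstressed."""
    return {i: 0 if units[i].word in FUNCTION_WORDS else t for i, t in plan.items()}


def sentence_level(units: List[Unit], plan: Plan) -> Plan:
    """Sentence level: placeholder, no sentence rule is modeled in this sketch."""
    return dict(plan)


LEVELS = (word_level, phrase_level, sentence_level)  # applied in series


def correct(units: List[Unit]) -> List[Unit]:
    """Detect mismatches, analyze them level by level, then rewrite the stress."""
    plan: Plan = {i: units[i].stress for i in detect(units)}
    for level in LEVELS:
        plan = level(units, plan)
    return [replace(u, stress=plan.get(i, u.stress)) for i, u in enumerate(units)]


# Units as a separate unit-selection synthesizer might have chosen them.
selected = [
    Unit("can", 0, 0),
    Unit("record", 0, 1), Unit("record", 1, 0),  # noun stress where the verb is wanted
    Unit("data", 0, 1), Unit("data", 1, 0),
]
fixed = correct(selected)
```

In this toy run the word-level pass restores the verb stress on "record", while the phrase-level pass overrides the word-level target for "can", which shows why the order of the passes matters when they run in series. How a real system would actually change the stress of a unit (for example by signal modification or by choosing a different unit) is outside the scope of the sketch.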