Title of invention: ELECTRONIC DEVICES AND METHODS FOR COMPENSATING FOR ENVIRONMENTAL NOISE IN TEXT-TO-SPEECH APPLICATIONS
Abstract: A method by an electronic device for compensating for environmental noise in text-to-speech (TTS) speech output includes: measuring environmental noise using a microphone signal; determining sound characteristics of the measured environmental noise; dynamically predicting expected future sound characteristics of the environmental noise based on the determined sound characteristics of the measured environmental noise; receiving a text input at a TTS engine at the device, with the TTS engine configured to convert the text input into a speech output signal; determining text characteristics of the text input at the TTS engine; and at the TTS engine, dynamically adapting the speech output signal based on the determined text characteristics of the text input and the predicted expected future sound characteristics of the environmental noise.
Publication number: US2016275936(A1)  Publication date: 2016.09.22
Application number: US201314374170  Filing date: 2013.12.17
Applicant: Sony Corporation  Inventor: Thorn Ola
Classification: G10L13/033;G10L15/06;G10L21/0216;G10L13/047;G10L13/08;G10L21/003  Main classification: G10L13/033
Agency:  Agent:
Principal claim: 1. A method by an electronic device for compensating for environmental noise in text-to-speech (TTS) speech output, the method comprising: measuring environmental noise using a microphone signal; determining sound characteristics of the measured environmental noise; dynamically predicting expected future sound characteristics of the environmental noise based on the determined sound characteristics of the measured environmental noise, wherein dynamically predicting expected future sound characteristics of the environmental noise comprises characterizing a time-varying pattern of the expected future sound characteristics of the environmental noise based on a time-varying pattern observed in determined sound characteristics of previously occurring environmental noise; receiving a text input at a TTS engine at the device, the TTS engine configured to convert the text input into a speech output signal; determining text characteristics of the text input at the TTS engine; and at the TTS engine, dynamically adapting the speech output signal based on the determined text characteristics of the text input and the predicted expected future sound characteristics of the environmental noise, wherein dynamically adapting the speech output signal comprises varying the pace of the speech output and/or varying the pitch of the speech output.
Address: Tokyo JP
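
Illustrative sketch (not part of the patent record): the steps recited in the principal claim can be pictured with a small Python example. Everything in it is an assumption made for illustration and is not taken from the patent: the function names frame_level_db, predict_noise_levels and plan_speech, the use of a repeating-period search as the "time-varying pattern", the digit test standing in for "text characteristics", the 60 dB threshold, and the pace and pitch factors. It shows one possible way such a pipeline might be arranged, not how the claimed method is implemented.

```python
import math


def frame_level_db(samples):
    """Return the RMS level of one microphone frame in dB (1.0 = full scale)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-12))  # floor avoids log10(0)


def predict_noise_levels(history, horizon):
    """Predict future noise levels by repeating the best-fitting period.

    Assumption: the noise is roughly periodic. The lag whose shifted copy
    of `history` (non-empty) differs least from the original is the period.
    """
    best_lag, best_err = 1, float("inf")
    for lag in range(1, len(history) // 2 + 1):
        pairs = zip(history[lag:], history[:-lag])
        err = sum(abs(a - b) for a, b in pairs) / (len(history) - lag)
        if err < best_err:  # strict '<' keeps the shortest matching period
            best_lag, best_err = lag, err
    last_period = history[-best_lag:]
    return [last_period[i % best_lag] for i in range(horizon)]


def plan_speech(words, predicted_db, noisy_db=60.0):
    """Return (word, pace factor, pitch factor) for each word.

    Hypothetical rules: slow down and raise pitch when the predicted noise
    for a word's time slot is high; slow down further for words containing
    digits, treated here as information that is easy to mishear.
    """
    plan = []
    for word, noise in zip(words, predicted_db):
        pace, pitch = 1.0, 1.0
        if noise >= noisy_db:
            pace *= 0.8
            pitch *= 1.15
        if any(ch.isdigit() for ch in word):
            pace *= 0.9
        plan.append((word, round(pace, 2), round(pitch, 2)))
    return plan


# Example: noise levels (dB) that peak every third frame, one frame per word.
history = [40, 40, 70, 40, 40, 70, 40, 40, 70]
words = ["call", "555", "now", "please"]
predicted = predict_noise_levels(history, len(words))
for entry in plan_speech(words, predicted):
    print(entry)
```

In this sketch one noise frame is assumed to line up with one spoken word. A real TTS engine would need to map predicted noise onto the actual timing of the synthesized speech, which the example does not attempt.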