Title of invention: Apparatus and method for creating dictionary for speech synthesis utilizing a display to aid in assessing synthesis quality
Abstract: Apparatus for creating a dictionary for speech synthesis includes a sentence storage unit configured to store N sentences, a sentence display unit configured to selectively display a first sentence which is one of the N sentences, a recording unit configured to record each user speech, a necessity determination unit configured to make a determination of whether to create the dictionary, a dictionary creation unit configured to create the dictionary by utilizing the user speech, and a speech synthesis unit configured to convert a second sentence to a synthesized speech with the dictionary. The display unit is configured to stop displaying the currently displayed sentence according to an evaluation of a quality of its synthesis. The determination unit makes the determination under a condition that the recording unit records the user speech of M first sentences (M is less than N) and the determination is based on at least one of an instruction from the user, M and an amount of the recorded user speech.
Publication number: US9129596(B2) Publication date: 2015.09.08
Application number: US201213535782 Filing date: 2012.06.28
Applicant: Kabushiki Kaisha Toshiba Inventors: Tachibana Kentaro;Morita Masahiro;Kagoshima Takehiko
Classification: G06F17/21;G10L13/00;G10L13/02;G10L13/06;G10L25/60 Main classification: G06F17/21
Agency: Finnegan, Henderson, Farabow, Garrett & Dunner, L.L.P. Agent: Finnegan, Henderson, Farabow, Garrett & Dunner, L.L.P.
Main claim: 1. An apparatus for creating a dictionary for speech synthesis, comprising: a sentence storage unit configured to store N sentences where N is a counting number, each sentence being prepared in advance to prompt a user to utter; a sentence display unit configured to selectively display at least one first sentence, each first sentence being one of the N sentences; a recording unit configured to record each user speech corresponding to each first sentence; a necessity determination unit, under a condition that the recording unit records the user speech of M first sentences, M being a counting number less than N, configured to make a determination of whether to create the dictionary based on at least one of an instruction from the user, the counting number M, and an amount of the user speech recorded; a dictionary creation unit configured to create the dictionary by utilizing the user speech and the first sentences corresponding to the user speech when the necessity determining unit makes the determination that the dictionary creation unit needs to create the dictionary; a speech synthesis unit configured to convert a second sentence, which is the same as the displayed at least one first sentence, to a synthesized speech by utilizing the dictionary; and a quality evaluation unit configured to evaluate a sound quality of the synthesized speech, wherein the sentence display unit is configured to stop displaying the currently displayed at least one first sentence when the quality evaluation unit evaluates that the sound quality of the synthesized speech has reached a certain high quality.
Address: Tokyo JP
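
Illustrative sketch (not part of the patent record): the control flow recited in the main claim can be outlined roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The names Recording, needs_dictionary and run_session, the thresholds min_sentences and min_seconds, the numeric quality_target, and the rule that any one satisfied criterion triggers dictionary creation are all hypothetical. The recording, dictionary creation, synthesis and quality evaluation units are passed in as plain callables because the record does not specify how they work.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Recording:
    sentence: str   # the displayed "first sentence" the user read aloud
    seconds: float  # amount of user speech recorded for that sentence


def needs_dictionary(recordings: List[Recording], n_sentences: int,
                     user_requested: bool = False,
                     min_sentences: int = 3,       # assumed threshold on M
                     min_seconds: float = 10.0     # assumed threshold on speech amount
                     ) -> bool:
    """Necessity determination: only considered while 1 <= M < N."""
    m = len(recordings)
    if m < 1 or m >= n_sentences:
        return False
    total_seconds = sum(r.seconds for r in recordings)
    # Assumption: any single satisfied criterion is enough.
    return user_requested or m >= min_sentences or total_seconds >= min_seconds


def run_session(sentences: List[str],
                record: Callable[[str], float],
                build_dictionary: Callable[[List[Recording]], Any],
                synthesize: Callable[[Any, str], Any],
                evaluate_quality: Callable[[Any], float],
                quality_target: float = 0.8       # assumed "certain high quality"
                ) -> Tuple[int, bool]:
    """Return (number of sentences recorded, whether display stopped early)."""
    recordings: List[Recording] = []
    for sentence in sentences:
        # Display the sentence and record the user's reading of it.
        recordings.append(Recording(sentence, record(sentence)))
        if not needs_dictionary(recordings, len(sentences)):
            continue
        dictionary = build_dictionary(recordings)
        # The "second sentence" is the same as the displayed first sentence.
        audio = synthesize(dictionary, sentence)
        if evaluate_quality(audio) >= quality_target:
            return len(recordings), True  # stop displaying further sentences
    return len(recordings), False
```

In this sketch the session ends as soon as the evaluated quality reaches the target, so a user may record far fewer than N sentences. Rebuilding the dictionary after every qualifying recording is a simplification chosen for brevity; a real system would likely rebuild less often.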