摘要 |
<p>A new speaker provides speech from which comparison snippets are extracted. The comparison snippets are compared with initial snippets stored in a recorded snippet database that is associated with a concatenative synthesizer. The comparison of the snippets to the initial snippets produces required sound units (Figure 1). A greedy selection algorithm is performed with the required sound units for identifying the smallest subset of the input text that contains all of the text for the new speaker to read. The new speaker then reads the optimally selected text and sound units are extracted from the human speech such that the recorded snippet database is modified and the speech synthesized adopts the voice quality and characteristics of the new speaker.</p> |