Abstract |
PURPOSE: To predict in advance whether noise will be mixed into the output, and to select synthesis units so that synthesized speech of at least a fixed quality is obtained, in a rule-based synthesizer that uses phoneme waveform pieces. CONSTITUTION: A primary selection part 42 selects plural candidate synthesis units corresponding to the input text according to a prescribed evaluation criterion, and a secondary selection part 43 then selects a phoneme piece (synthesis unit) from those candidates so that the distortion at phoneme piece connections is smallest. That is, phoneme pieces are selected in two stages. In the second stage, the cepstrum distance between phoneme pieces is used as the connection cost, and a path is selected so that the connection cost between phoneme pieces is minimized. A synthetic quality decision part 44 decides whether or not the fixed quality will be obtained at synthesis time; when it will not, the part 44 obtains a substitute phoneme piece from an alternate phoneme piece generation part 45 and sends it to a synthesis phoneme piece extraction part 46. The synthesis phoneme piece extraction part 46 extracts the actual data (PCM data) of the relevant phoneme piece from a phoneme piece dictionary 5 and outputs it to a phoneme piece deformation connection part in the subsequent stage. |
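
The second-stage selection described above can be illustrated with a minimal sketch. It is not the implementation from the abstract: it assumes that each candidate carries one cepstrum vector for its start boundary and one for its end boundary, that the cepstrum distance is a plain Euclidean distance, and that the minimum-cost path is found by dynamic programming. The names `cepstral_distance` and `select_units` and the tuple layout are hypothetical.

```python
import math


def cepstral_distance(a, b):
    """Euclidean distance between two cepstrum vectors of equal length (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_units(candidates):
    """Pick one candidate per slot so that the summed connection cost is minimal.

    candidates[i] is the list of first-stage candidates for slot i, each a
    tuple (unit_id, start_cepstrum, end_cepstrum). This layout is hypothetical.
    Returns (total_cost, [unit_id, ...]).
    """
    if not candidates or any(not slot for slot in candidates):
        raise ValueError("every slot needs at least one candidate")

    # costs[j]: cheapest accumulated connection cost of a path ending at
    # candidate j of the current slot. The first slot has no join yet.
    costs = [0.0] * len(candidates[0])
    back = []  # back[i][j]: best predecessor index for candidate j of slot i + 1

    for prev_slot, slot in zip(candidates, candidates[1:]):
        new_costs, pointers = [], []
        for _, start_cep, _ in slot:
            # Cost of joining each predecessor's end to this candidate's start.
            options = [
                costs[k] + cepstral_distance(prev_end, start_cep)
                for k, (_, _, prev_end) in enumerate(prev_slot)
            ]
            k_best = min(range(len(options)), key=options.__getitem__)
            new_costs.append(options[k_best])
            pointers.append(k_best)
        costs = new_costs
        back.append(pointers)

    # Trace the cheapest path back from the last slot to the first.
    j = min(range(len(costs)), key=costs.__getitem__)
    total = costs[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return total, [candidates[i][j][0] for i, j in enumerate(path)]


if __name__ == "__main__":
    demo = [
        [("a1", [0.0, 0.0], [1.0, 0.0]), ("a2", [0.0, 0.0], [0.0, 1.0])],
        [("b1", [1.0, 0.0], [2.0, 0.0]), ("b2", [5.0, 5.0], [2.0, 2.0])],
        [("c1", [2.0, 0.0], [0.0, 0.0]), ("c2", [2.0, 2.0], [0.0, 0.0])],
    ]
    print(select_units(demo))  # (0.0, ['a1', 'b1', 'c1'])
```

In this toy data the boundaries of `a1`, `b1`, and `c1` match exactly, so the selected path has zero connection cost. A quality decision like that of part 44 could, under a further assumption not stated in the abstract, compare the returned cost against a threshold before requesting a substitute phoneme piece.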