Abstract |
PROBLEM TO BE SOLVED: To perform precise voice quality conversion while easing the burden of speaking on the conversion destination speaker. SOLUTION: A nonlinear frequency axis spectrum matching part 8 finds a frequency warping function that relates the spectrum envelope of a conversion source speaker to the spectrum envelope of the conversion destination speaker. A frequency warp table memory 9 stores mean frequency warping functions computed per 'phoneme', 'similar phonemes', 'voiced/voiceless sound section', and 'whole voice section'. For voice quality conversion, the mean frequency warping function is used to convert the spectrum envelope of the conversion source speaker into the spectrum envelope of the conversion destination speaker, so that precise voice quality conversion is performed. When the amount of speech data from the conversion destination speaker is small, the mean frequency warping functions of 'similar phonemes', 'voiced/voiceless sound sections', etc., are used instead, thereby easing the burden of speaking on the conversion destination speaker. |
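
The table lookup and warping step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The fallback order, the `min_count` threshold, the table layout, and the names `select_warp` and `apply_warp` are all hypothetical. It also assumes a warping function is stored as an array whose entry `k` gives the (possibly fractional) source frequency bin that maps onto destination bin `k`.

```python
import numpy as np

# Hypothetical fallback order, most specific category first.
LEVELS = ("phoneme", "similar_phonemes", "voicing", "whole")


def select_warp(table, keys, min_count=1):
    """Pick the most specific mean warping function that has enough data.

    table maps (level, key) -> (sample_count, mean_warp_array).
    keys maps level -> the key of the current frame at that level.
    """
    for level in LEVELS:
        entry = table.get((level, keys.get(level)))
        if entry is not None and entry[0] >= min_count:
            return level, entry[1]
    raise KeyError("no usable mean warping function in the table")


def apply_warp(envelope, warp):
    """Resample the source envelope along the warped frequency axis.

    Output bin k takes the linearly interpolated source value at warp[k]
    (an assumed convention, see the lead-in).
    """
    bins = np.arange(len(envelope))
    return np.interp(warp, bins, envelope)


# Toy data: the phoneme-level entry has no destination-speaker samples,
# so the lookup falls back to the voiced/voiceless entry.
envelope = np.array([0.0, 10.0, 20.0, 30.0])
table = {
    ("phoneme", "a"): (0, np.array([0.0, 1.0, 2.0, 3.0])),
    ("voicing", "voiced"): (12, np.array([0.0, 0.5, 1.0, 1.5])),
}
keys = {
    "phoneme": "a",
    "similar_phonemes": "vowels",
    "voicing": "voiced",
    "whole": "all",
}

level, warp = select_warp(table, keys)
converted = apply_warp(envelope, warp)
```

Linear interpolation is used here only for brevity; the abstract does not say how the envelope is resampled.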