Abstract |
A speech processing apparatus according to an embodiment of the invention includes a conversion-source-speaker speech-unit database, a voice-conversion-rule-learning-data generating means, and a voice-conversion-rule learning means, with which it generates voice conversion rules. The voice-conversion-rule-learning-data generating means includes a conversion-target-speaker speech-unit extracting means, an attribute-information generating means, a conversion-source-speaker speech-unit database, and a conversion-source-speaker speech-unit selection means. The conversion-source-speaker speech-unit selection means selects the conversion-source-speaker speech units that correspond to the conversion-target-speaker speech units, based on the mismatch between the attribute information of the conversion-target-speaker speech units and that of the conversion-source-speaker speech units. The voice conversion rules are then generated from the selected pairs of conversion-target-speaker speech units and conversion-source-speaker speech units.
|
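The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not say which attributes are compared or how the mismatch is scored. The `SpeechUnit` fields (`phoneme`, `f0`, `duration`), the weighted absolute-difference cost, the weights, and the choice of a single lowest-cost source unit per target unit are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechUnit:
    """Hypothetical speech unit; the attribute set is assumed, not from the source."""
    unit_id: str
    phoneme: str
    f0: float        # mean fundamental frequency in Hz (assumed attribute)
    duration: float  # length in seconds (assumed attribute)


def attribute_cost(target, source, w_f0=1.0, w_dur=1000.0):
    """Assumed mismatch measure between the attribute information of two units.

    Units with different phoneme labels are treated as incompatible (infinite
    cost). Otherwise the cost is a weighted sum of absolute differences.
    """
    if target.phoneme != source.phoneme:
        return math.inf
    return (w_f0 * abs(target.f0 - source.f0)
            + w_dur * abs(target.duration - source.duration))


def select_source_units(targets, sources):
    """Pair each target-speaker unit with the lowest-cost source-speaker unit.

    Target units with no compatible source unit are skipped.
    """
    pairs = []
    for t in targets:
        best = min(sources, key=lambda s: attribute_cost(t, s), default=None)
        if best is not None and math.isfinite(attribute_cost(t, best)):
            pairs.append((t, best))
    return pairs


if __name__ == "__main__":
    targets = [SpeechUnit("t1", "a", 200.0, 0.10)]
    sources = [
        SpeechUnit("s1", "a", 120.0, 0.10),
        SpeechUnit("s2", "a", 190.0, 0.11),
        SpeechUnit("s3", "i", 200.0, 0.10),
    ]
    for t, s in select_source_units(targets, sources):
        print(t.unit_id, "->", s.unit_id)  # t1 -> s2
```

The resulting (target, source) pairs would then serve as the learning data passed to the voice-conversion-rule learning step. How the rules are learned from those pairs is outside the scope of this sketch.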