Abstract |
PROBLEM TO BE SOLVED: To provide a speech recognition device and method that reduce the amount of processing required for noise adaptation and speaker adaptation. SOLUTION: A representative speech model C and a difference model D, obtained by clustering or similar processing of an initial speech model, are stored in advance in a representative speech model storage part 1a and a difference model storage part 1b, respectively. Before speech recognition is performed, noise adaptation is applied to the representative speech model C to generate a noise-adapted representative speech model C^N, which is combined with the difference model D to generate a noise-adapted composite speech model M. Speaker adaptation is then applied to the composite speech model M using a feature vector series V(n) of an uttered speech to generate a noise- and speaker-adapted speech model R. An updated difference model D" is generated from the difference between the noise- and speaker-adapted speech model R and the noise-adapted representative speech model C^N, and the difference model D in the storage part 1b is replaced with the updated difference model D". For speech recognition, the representative speech model C and the updated difference model D" are combined to generate a composite speech model M" that has undergone noise adaptation and speaker adaptation, and the feature vector series V(n) of the speaker's speech to be recognized is collated against it to perform the speech recognition. COPYRIGHT: (C)2004,JPO
|
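The data flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration only, under several assumptions that are not in the source: each model is reduced to a single mean vector, "combining" a representative model with a difference model is taken to be vector addition, `noise_adapt` is a placeholder additive shift, and `speaker_adapt` is a placeholder interpolation toward the mean of the observed feature vectors. The abstract does not name the adaptation algorithms, so all function names and the `weight` parameter are hypothetical. The sketch also assumes that the representative model is noise-adapted again at recognition time, which the abstract implies but does not state step by step.

```python
from typing import Dict, List, Optional

Vector = List[float]


def add(a: Vector, b: Vector) -> Vector:
    return [x + y for x, y in zip(a, b)]


def sub(a: Vector, b: Vector) -> Vector:
    return [x - y for x, y in zip(a, b)]


def noise_adapt(c: Vector, noise_mean: Vector) -> Vector:
    # Placeholder (assumption): shift the representative model by a noise estimate.
    return add(c, noise_mean)


def speaker_adapt(m: Vector, frames: List[Vector], weight: float = 0.5) -> Vector:
    # Placeholder (assumption): move the model mean toward the mean of the
    # observed feature vectors V(n) by a fixed interpolation weight.
    n = len(frames)
    frame_mean = [sum(col) / n for col in zip(*frames)]
    return [(1 - weight) * mi + weight * fi for mi, fi in zip(m, frame_mean)]


def update_difference_models(
    c: Vector,
    diffs: Dict[str, Vector],
    noise_mean: Vector,
    frames_per_model: Dict[str, List[Vector]],
) -> Dict[str, Vector]:
    """Adaptation phase: return the updated difference models D"."""
    c_n = noise_adapt(c, noise_mean)  # C^N: only the representative model is noise-adapted
    updated: Dict[str, Vector] = {}
    for name, d in diffs.items():
        m = add(c_n, d)  # composite model M = C^N combined with D
        frames: Optional[List[Vector]] = frames_per_model.get(name)
        r = speaker_adapt(m, frames) if frames else m  # R (unchanged if no data)
        updated[name] = sub(r, c_n)  # D" = R - C^N
    return updated


def recognition_models(
    c: Vector, updated_diffs: Dict[str, Vector], noise_mean: Vector
) -> Dict[str, Vector]:
    """Recognition phase: composite models M" built from C^N and D"."""
    c_n = noise_adapt(c, noise_mean)
    return {name: add(c_n, d) for name, d in updated_diffs.items()}
```

In this sketch the saving comes from applying noise adaptation once per representative model rather than once per member model of the cluster; the member models are recovered by adding their stored differences.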