Abstract |
Voice feature data extracted from an input voice signal are stored in a first memory. Of the stored voice feature data, the feature data of a predetermined duration, defined by the output of a word boundary detection section, are read out by a re-sampling section and stored in a second memory. The voice feature data, thus normalized along the time axis, are supplied to a similarity computing section together with reference pattern data. A determining section identifies the category whose reference pattern yields the maximum similarity and outputs that category as the recognition result for the input voice.
|
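The flow described in the abstract (resample the frames between the detected word boundaries to a fixed length, compare the result with each reference pattern, and pick the category with the maximum similarity) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names `resample`, `cosine_similarity`, and `recognize`, the nearest-frame resampling rule, the fixed output length `n_out`, and the use of cosine similarity as the similarity measure. The abstract does not specify the resampling rule or the similarity computation, so this is not the claimed implementation.

```python
import math


def resample(frames, start, end, n_out):
    """Pick n_out frames evenly spaced over frames[start..end] (inclusive).

    Assumption: nearest-frame selection; the abstract does not state the rule.
    """
    if n_out < 1 or not (0 <= start <= end < len(frames)):
        raise ValueError("invalid boundaries or output length")
    if n_out == 1:
        return [frames[(start + end) // 2]]
    span = end - start
    denom = n_out - 1
    # Integer arithmetic: offset = round-half-up(i * span / denom).
    return [frames[start + (2 * i * span + denom) // (2 * denom)]
            for i in range(n_out)]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize(frames, start, end, references, n_out):
    """Return (best_category, scores) for the word spanning frames[start..end].

    `frames` plays the role of the first memory, the resampled list the
    second memory, and `references` maps each category to a flattened
    reference pattern of length n_out * feature_dimension.
    """
    normalized = resample(frames, start, end, n_out)
    flat = [value for frame in normalized for value in frame]
    scores = {category: cosine_similarity(flat, pattern)
              for category, pattern in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Toy data: 8 frames of 2-dimensional features; word detected at frames 2..7.
    frames = [[0, 0], [0, 0], [1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
    references = {
        "rising": [1, 0, 0, 1, 0, 1],
        "flat": [1, 0, 1, 0, 1, 0],
    }
    best, scores = recognize(frames, 2, 7, references, n_out=3)
    print(best, scores)
```

Because every utterance is reduced to the same number of frames before comparison, the similarity computation can operate on fixed-length vectors regardless of how long the word was spoken, which is the point of normalizing along the time axis.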