Abstract |
PURPOSE: To achieve high recognition performance for word speech, text speech, and the like by performing recognition that takes into account both the static and the dynamic time-frequency properties of speech. CONSTITUTION: A voice input analysis part 1 analyzes the digitized speech signal and outputs a time series of feature parameters. From this time series, a static phoneme feature vector extraction part 3 sequentially extracts static phoneme feature vectors, and a static phoneme collation part 6 collates them with a static phoneme dictionary 7 for each category. Similarly, dynamic phoneme feature vectors are collated with a dynamic phoneme dictionary 5 for each category at a dynamic phoneme collation part 4. The likelihoods obtained by these collations are output to a word collation part 8, which performs word collation. A decision output part 10 decides on and outputs the collation result. Phoneme collation can thus be performed with high performance, making full use of the time-frequency properties of speech. |
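
The data flow described above (feature time series, static and dynamic phoneme collation, word collation, decision output) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the abstract: the function names, the use of frame differences as the dynamic feature, the negative squared distance as a stand-in for likelihood, the uniform alignment of frames to phonemes, and the use of the same category labels in both dictionaries.

```python
def static_vectors(frames):
    """Static features: each analysis frame is used as is (assumption)."""
    return [list(f) for f in frames]


def dynamic_vectors(frames):
    """Dynamic features: central difference between neighbouring frames (assumption)."""
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, last)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def collate(vectors, dictionary):
    """Score every vector against every category reference in a dictionary.

    The score is the negative squared Euclidean distance, used here as a
    simple stand-in for a likelihood (assumption).
    """
    return [
        {cat: -sum((x - r) ** 2 for x, r in zip(v, ref))
         for cat, ref in dictionary.items()}
        for v in vectors
    ]


def word_score(static_scores, dynamic_scores, phoneme_seq):
    """Word collation: sum static and dynamic scores along a uniform alignment."""
    n = len(static_scores)
    total = 0.0
    for t in range(n):
        ph = phoneme_seq[t * len(phoneme_seq) // n]
        total += static_scores[t][ph] + dynamic_scores[t][ph]
    return total


def recognize(frames, static_dict, dynamic_dict, word_dict):
    """Run both collations, score each word, and return the best word and all scores."""
    s = collate(static_vectors(frames), static_dict)
    d = collate(dynamic_vectors(frames), dynamic_dict)
    scores = {w: word_score(s, d, seq) for w, seq in word_dict.items()}
    return max(scores, key=scores.get), scores


# Toy data (hypothetical): two-dimensional feature frames and two categories.
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
static_dict = {"a": [1.0, 0.0], "i": [0.0, 1.0]}
dynamic_dict = {"a": [0.0, 0.0], "i": [0.0, 0.0]}
word_dict = {"ai": ["a", "i"], "ia": ["i", "a"]}

best, scores = recognize(frames, static_dict, dynamic_dict, word_dict)
print(best, scores)
```

A practical system would replace the uniform alignment with a time-alignment search and the distance score with a trained statistical measure; the sketch only shows how the two collation paths feed a single word-level decision.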