Abstract |
PROBLEM TO BE SOLVED: To recognize the speech of an unspecified speaker with high precision, taking into account the dependency of each phoneme on its phonetic environment, by using an ergodic hidden Markov model (HMM) to calculate the joint probability of the feature vector time series of an unknown input speech and the phoneme symbol sequence of each recognition-target word hypothesis, and outputting the word with the maximum probability among the recognition-target words as the recognition result. SOLUTION: A transition probability storage part 20 stores the transition probabilities attached to the N×N state transitions among N ordered states. An output probability storage part 30 stores, for each state transition, the output probability of phoneme symbols and the output probability of feature vectors. A word matching part 40 uses a single ergodic HMM, which outputs both a phoneme symbol sequence and a feature vector sequence, to calculate the joint probability of the feature vector time series of the unknown input speech and the phoneme symbol sequence of each recognition-target word hypothesis. A recognition result output part 50 then outputs the word with the maximum probability among all recognition-target words as the recognition result.
|
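The matching step described in the abstract can be illustrated with a small forward-algorithm sketch. It is not the patented implementation; it only shows one way a single ergodic HMM whose transitions emit both a phoneme symbol and a feature vector could score word hypotheses. Several points are assumptions made for illustration: each hypothesis is given as a frame-aligned phoneme label sequence (one label per feature vector), feature vectors follow a unit-variance Gaussian per transition, an initial state distribution `init` exists, and all names (`joint_probability`, `recognize`, `phone_prob`, `means`) and the toy parameter values are hypothetical.

```python
import math

def gaussian_density(x, mean):
    """Unit-variance diagonal Gaussian density (an assumed output distribution)."""
    sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return math.exp(-0.5 * sq) / (2 * math.pi) ** (len(x) / 2)

def joint_probability(features, phonemes, init, trans, phone_prob, means):
    """Forward algorithm: P(features, phonemes), summed over all state paths.

    init[i]              initial probability of state i (assumed)
    trans[i][j]          transition probability i -> j (N x N, ergodic)
    phone_prob[i][j][p]  probability that transition i -> j emits phoneme p
    means[i][j]          Gaussian mean of the feature vector emitted on i -> j
    """
    assert len(features) == len(phonemes)  # frame-aligned labels (assumption)
    n = len(init)
    alpha = list(init)
    for x, p in zip(features, phonemes):
        # Each transition i -> j emits the phoneme symbol and the feature vector.
        alpha = [
            sum(alpha[i] * trans[i][j] * phone_prob[i][j].get(p, 0.0)
                * gaussian_density(x, means[i][j]) for i in range(n))
            for j in range(n)
        ]
    return sum(alpha)

def recognize(features, lexicon, model):
    """Score every word hypothesis and return the most probable one."""
    scores = {w: joint_probability(features, ph, *model)
              for w, ph in lexicon.items()}
    return max(scores, key=scores.get), scores

# Toy model with N = 2 states; all values are made up for illustration.
init = [1.0, 0.0]
trans = [[0.5, 0.5], [0.5, 0.5]]
to_state = [{"a": 0.9, "b": 0.1}, {"a": 0.1, "b": 0.9}]
phone_prob = [[to_state[j] for j in range(2)] for _ in range(2)]
means = [[[0.0], [3.0]] for _ in range(2)]
model = (init, trans, phone_prob, means)

lexicon = {"aa": ["a", "a"], "ab": ["a", "b"]}
features = [[0.1], [2.9]]  # two frames of 1-dimensional feature vectors
best, scores = recognize(features, lexicon, model)
print(best, scores)
```

Because all words share the same HMM, only the phoneme label sequence changes between hypotheses; a practical system would work in the log domain to avoid underflow on long utterances.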