Abstract |
A speech recognition apparatus and a method thereof for correctly recognizing an English word from, for example, a non-native English pronunciation. A vector data generating part and a label generating part process speech data of an English sentence spoken by a Japanese speaker and convert it into a label string. A candidate word generating part correlates the label string of the sentence to a first candidate word comprising one or more English words. An analogous word adding part uses a word database to search for an English word analogous in pronunciation to the first candidate word, for example the analogous word "lead" for the first candidate word "read" (it is difficult for a Japanese speaker to discriminate between "l" and "r" in pronunciation), and adds the obtained analogous word to the first candidate word to form a second candidate word. A selection part selects one of the second candidate words as the final recognition result in response to a user's operation and connects the selected words into English text data for output.
|
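The candidate expansion and selection steps described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the apparatus's actual implementation: the `ANALOGOUS_WORDS` table, the function names, and the index-based selection are all assumptions made for the example, and the acoustic front end (vector data and label generation) is omitted.

```python
from typing import Dict, List

# Hypothetical word database (an assumption for illustration): maps an
# English word to words that are analogous in pronunciation for a
# Japanese speaker, such as "l"/"r" confusions.
ANALOGOUS_WORDS: Dict[str, List[str]] = {
    "read": ["lead"],
    "lead": ["read"],
    "light": ["right"],
    "right": ["light"],
}


def add_analogous_words(first_candidates: List[str],
                        database: Dict[str, List[str]]) -> List[str]:
    """Build the second candidate list: each first candidate followed by
    its analogous words, without duplicates and in a stable order."""
    second: List[str] = []
    for word in first_candidates:
        for candidate in [word, *database.get(word, [])]:
            if candidate not in second:
                second.append(candidate)
    return second


def recognize(sentence_candidates: List[List[str]],
              choices: List[int],
              database: Dict[str, List[str]] = ANALOGOUS_WORDS) -> str:
    """For each word position, expand the first candidates, take the
    candidate the user picked (by index), and join the picks into text."""
    selected: List[str] = []
    for first, choice in zip(sentence_candidates, choices):
        second = add_analogous_words(first, database)
        selected.append(second[choice])
    return " ".join(selected)


if __name__ == "__main__":
    # First candidates per word position, as a candidate word generating
    # part might produce them from the label string (made-up values).
    candidates = [["please"], ["read"], ["this"]]
    print(add_analogous_words(["read"], ANALOGOUS_WORDS))
    print(recognize(candidates, [0, 1, 0]))
```

Here the user's operation is modeled as a list of indices into each second candidate list; a real interface would more likely present the candidates and let the user pick one interactively.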