Title of invention: Speech recognition method and electronic apparatus
Abstract: A speech recognition method and an electronic apparatus are provided. The speech recognition method includes the following steps. A plurality of phonetic transcriptions of a speech signal are obtained according to an acoustic model. Phonetic spellings and intonation information matching the phonetic transcriptions are obtained according to a phonetic transcription sequence and the syllable acoustic lexicon of the invention. According to the phonetic spellings and the intonation information, a plurality of phonetic spelling sequences and a plurality of phonetic spelling sequence probabilities are obtained from a language model. The phonetic spelling sequence corresponding to the largest of the phonetic spelling sequence probabilities is selected as the recognition result of the speech signal.
Publication number: US9613621(B2)  Publication date: 2017.04.04
Application number: US201414490677  Filing date: 2014.09.19
Applicant: VIA Technologies, Inc.  Inventors: Zhang Guo-Feng; Zhu Yi-Fei
Classification: G10L15/187; G10L25/33; G10L15/06; G10L13/08  Main classification: G10L15/187
Agency: Jianq Chyun IP Office  Agent: Jianq Chyun IP Office
Main claim: 1. A speech recognition method, adapted to an electronic apparatus, comprising: obtaining a phonetic transcription sequence of a speech signal according to an acoustic model; obtaining a plurality of possible syllable sequences and a plurality of corresponding phonetic spelling matching probabilities according to the phonetic transcription sequence and a syllable acoustic lexicon; obtaining intonation information corresponding to each of the syllable sequences according to a tone of the phonetic transcription sequence; obtaining a plurality of phonetic spelling sequences and a plurality of phonetic spelling sequence probabilities, from the language model, according to each phonetic spelling of phonetic spelling sequences and the intonation information; obtaining, from the language model, a plurality of text sequences corresponding to the phonetic transcription sequence, and a plurality of spelling sequence probabilities; generating a plurality of associated probabilities by multiplying each of the phonetic spelling matching probabilities and each of the spelling sequence probabilities; and selecting the text sequence corresponding to a largest one among the associated probabilities to be used as a recognition result of the speech signal, wherein different intonation information in the language model is divided into different semantemes, and the semantemes correspond to different phonetic spelling sequences.
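The final selection step of the claim (multiplying each phonetic spelling matching probability by each spelling sequence probability and keeping the text sequence with the largest product) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names `CANDIDATES`, `LANGUAGE_MODEL` and `recognize`, the tone-digit notation such as `shi4`, the dictionary layout of the language model, and all probability values.

```python
# Hypothetical output of the syllable acoustic lexicon: tone-marked syllable
# sequences paired with a phonetic spelling matching probability.
CANDIDATES = [
    (("shi4", "jie4"), 0.5),
    (("shi2", "jie2"), 0.4),
    (("shi1", "jie1"), 0.1),
]

# Hypothetical tone-aware language model: each tone-marked spelling sequence
# maps to candidate text sequences and their spelling sequence probabilities.
# Keying on the tone keeps differently toned sequences as separate entries.
LANGUAGE_MODEL = {
    ("shi4", "jie4"): [("世界", 0.6), ("视界", 0.1)],
    ("shi2", "jie2"): [("时节", 0.5)],
}


def recognize(candidates, language_model):
    """Return (associated probability, text, syllables) with the largest
    associated probability, or None if no candidate is in the model."""
    scored = []
    for syllables, match_prob in candidates:
        # Sequences unknown to the language model contribute no candidates.
        for text, seq_prob in language_model.get(syllables, []):
            # Associated probability = matching probability * sequence probability.
            scored.append((match_prob * seq_prob, text, syllables))
    if not scored:
        return None
    return max(scored, key=lambda item: item[0])


if __name__ == "__main__":
    print(recognize(CANDIDATES, LANGUAGE_MODEL))
```

With these made-up numbers the products are 0.30, 0.05 and 0.20, so the sketch selects the text paired with `("shi4", "jie4")`. A real recognizer would more likely work with log probabilities and a pruned search than with an exhaustive product.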
Address: New Taipei, TW