Title of invention Apparatus for speech recognition using multiple acoustic model and method thereof
Abstract Disclosed are an apparatus for recognizing voice using multiple acoustic models according to the present invention and a method thereof. An apparatus for recognizing voice using multiple acoustic models includes a voice data database (DB) configured to store voice data collected in various noise environments; a model generating means configured to perform classification for each speaker and environment based on the collected voice data, and to generate an acoustic model of a binary tree structure as the classification result; and a voice recognizing means configured to extract feature data of voice data when the voice data is received from a user, to select multiple models from the generated acoustic model based on the extracted feature data, to parallel recognize the voice data based on the selected multiple models, and to output a word string corresponding to the voice data as the recognition result.
Publication number US9378742(B2) Publication date 2016.06.28
Application number US201313845941 Filing date 2013.03.18
Applicant Electronics and Telecommunications Research Institute Inventor Kim Dong Hyun
Classification G10L15/32;G10L15/065 Main classification G10L15/32
Agency Nelson Mullins Riley & Scarborough LLP Agent Nelson Mullins Riley & Scarborough LLP; Laurentano Anthony A.
Main claim 1. An apparatus for recognizing voice using multiple acoustic models, the apparatus comprising: a voice data database (DB) configured to store voice data collected in various noise environments; a model generating means configured to perform classification for each speaker and environment based on the collected voice data, and to generate a Gaussian mixture model (GMM) acoustic model for calculating a similarity and a hidden Markov model (HMM) acoustic model for recognizing voice of a binary tree structure as the classification result; and a voice recognizing means configured to extract feature data of voice data when the voice data is received from a user, to select multiple models from the generated HMM acoustic model using the generated GMM acoustic model based on the extracted feature data, to parallel recognize the voice data based on the selected multiple models, and to output a word string corresponding to the voice data as the recognition result, wherein the voice recognizing means calculates a similarity between the extracted feature data and the generated GMM acoustic model and HMM acoustic models, and repeats a process of selecting a model until final N models are obtained in a descending order of the similarity.
Address Daejeon KR
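
Illustrative sketch (not part of the patent record). The model-selection step in the main claim, scoring extracted features against GMMs arranged in a binary tree and repeating the selection until N models are obtained in descending order of similarity, could be approximated as below. Everything here is an assumption made for illustration: the names Gaussian, Node, gmm_similarity and select_models are hypothetical, diagonal covariances are assumed, average per-frame log-likelihood stands in for the unspecified similarity measure, and the best-first traversal (which always visits the root first) is only one plausible reading of the claim wording.

```python
import heapq
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Gaussian:
    weight: float
    mean: Sequence[float]
    var: Sequence[float]  # diagonal covariance (assumption)


@dataclass
class Node:
    name: str
    gmm: List[Gaussian]  # GMM used only to score similarity
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def log_gauss(x: Sequence[float], g: Gaussian) -> float:
    """Log density of x under one diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, g.mean, g.var))


def gmm_similarity(frames: Sequence[Sequence[float]], gmm: List[Gaussian]) -> float:
    """Average per-frame log-likelihood (stand-in for the claim's 'similarity')."""
    total = 0.0
    for x in frames:
        terms = [math.log(g.weight) + log_gauss(x, g) for g in gmm]
        peak = max(terms)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)


def select_models(root: Node, frames, n: int) -> List[Tuple[float, str]]:
    """Best-first walk of the tree; returns up to n (score, name) pairs, best first."""
    heap = [(-gmm_similarity(frames, root.gmm), 0, root)]  # min-heap on negated score
    counter = 1  # tie-breaker so Node objects are never compared
    selected: List[Tuple[float, str]] = []
    while heap and len(selected) < n:
        neg_score, _, node = heapq.heappop(heap)
        selected.append((-neg_score, node.name))
        for child in (node.left, node.right):
            if child is not None:
                score = gmm_similarity(frames, child.gmm)
                heapq.heappush(heap, (-score, counter, child))
                counter += 1
    return sorted(selected, reverse=True)  # descending order of similarity


def make_node(name, mean, var, left=None, right=None) -> Node:
    """Helper for the toy example: a node with a one-component, 1-D GMM."""
    return Node(name, [Gaussian(1.0, [mean], [var])], left, right)


# Toy tree and 1-D "feature frames"; the values are made up for illustration.
tree = make_node("root", 0.0, 4.0,
                 left=make_node("left", -1.0, 1.0),
                 right=make_node("right", 1.0, 1.0,
                                 left=make_node("right.left", 0.9, 0.25),
                                 right=make_node("right.right", 2.0, 0.25)))
frames = [[1.0], [1.1], [0.9]]
picked = select_models(tree, frames, n=3)
print([name for _, name in picked])
```

In a fuller system, each selected node would also carry an HMM acoustic model, and the N selected HMMs would decode the utterance in parallel before a final word string is chosen; that stage is omitted from this sketch.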