Title: Discriminative language modeling for automatic speech recognition with a weak acoustic model and distributed training
Abstract: Training data from a plurality of utterance-to-text-string mappings of an automatic speech recognition (ASR) system may be selected. Parameters of the ASR system that characterize the utterances and their respective mappings may be determined through application of a first acoustic model and a language model. A second acoustic model and the language model may be applied to the selected training data utterances to determine a second set of utterance-to-text-string mappings. The first set of utterance-to-text-string mappings may be compared to the second set of utterance-to-text-string mappings, and the parameters of the ASR system may be updated based on the comparison.
Publication No.: US8965763(B1) Publication Date: 2015.02.24
Application No.: US201213461093 Filing Date: 2012.05.01
Applicant: Google Inc. Inventors: Chelba Ciprian Ioan;Strope Brian;Jyothi Preethi;Johnson Leif
Classification: G10L15/26;G10L15/32;G10L15/06 Main Classification: G10L15/26
Agency: McDonnell Boehnen Hulbert & Berghoff LLP Agent: McDonnell Boehnen Hulbert & Berghoff LLP
Main Claim: 1. A method comprising: determining, by a computing system, a reference transcription of a reference utterance, wherein the reference transcription is derived using a strong acoustic model, a language model and a weight vector, and wherein the reference transcription has a confidence level of at least 70%; based on the reference transcription having the confidence level of at least 70%, determining a secondary transcription of the reference utterance, wherein the secondary transcription is derived using a weak acoustic model, the language model and the weight vector, wherein the secondary transcription has a secondary confidence level, wherein the weak acoustic model has a higher error rate than the strong acoustic model, and wherein the secondary transcription is different from the reference transcription; and based on the secondary transcription being different from the reference transcription, updating the weight vector so that transcribing the reference utterance using the weak acoustic model, the language model and the updated weight vector results in a tertiary transcription with a tertiary confidence level that is greater than the secondary confidence level.
Address: Mountain View CA US
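
The update recited in claim 1 can be illustrated with a small Python sketch. The claim does not say how the weight vector is updated, so the perceptron-style rule used here is an assumption, as are the n-best list representation, the additive log-score combination, the sparse n-gram features, and every name in the code (`Hypothesis`, `decode`, `update_weights`, `EXAMPLE_NBEST`). The confidence level is taken as an input from the recognizer rather than computed, and the example scores are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """One candidate transcription (hypothetical representation)."""
    text: str
    strong_am: float  # log-score under the strong acoustic model
    weak_am: float    # log-score under the weak acoustic model
    lm: float         # log-score under the language model
    features: dict = field(default_factory=dict)  # sparse n-gram counts


def feature_score(weights, features):
    """Dot product of the sparse weight vector with the sparse features."""
    return sum(weights.get(name, 0.0) * count for name, count in features.items())


def decode(nbest, weights, acoustic_model):
    """Pick the best hypothesis using the 'strong' or 'weak' acoustic score."""
    def total(hyp):
        acoustic = hyp.strong_am if acoustic_model == "strong" else hyp.weak_am
        return acoustic + hyp.lm + feature_score(weights, hyp.features)
    return max(nbest, key=total)


def update_weights(nbest, weights, confidence, threshold=0.7, step=1.0):
    """Return an updated copy of the weight vector.

    Assumed perceptron-style rule: if the weak-model transcription differs
    from a sufficiently confident strong-model transcription, reward the
    reference features and penalize the secondary features.
    """
    if confidence < threshold:
        return dict(weights)  # reference not trusted: skip this utterance
    reference = decode(nbest, weights, "strong")
    secondary = decode(nbest, weights, "weak")
    if secondary.text == reference.text:
        return dict(weights)  # weak model already agrees: nothing to learn
    updated = dict(weights)
    for name, count in reference.features.items():
        updated[name] = updated.get(name, 0.0) + step * count
    for name, count in secondary.features.items():
        updated[name] = updated.get(name, 0.0) - step * count
    return updated


# Illustrative n-best list with made-up scores.
EXAMPLE_NBEST = [
    Hypothesis("recognize speech", strong_am=-1.0, weak_am=-3.0, lm=-2.0,
               features={"recognize speech": 1}),
    Hypothesis("wreck a nice beach", strong_am=-4.0, weak_am=-2.0, lm=-2.5,
               features={"nice beach": 1}),
]

if __name__ == "__main__":
    new_weights = update_weights(EXAMPLE_NBEST, {}, confidence=0.9)
    print(decode(EXAMPLE_NBEST, {}, "weak").text)
    print(decode(EXAMPLE_NBEST, new_weights, "weak").text)
```

In this toy example the weak acoustic model initially prefers the wrong hypothesis, and after one update the same weak-model decode selects the reference transcription. The sketch stops at changing the ranking; it does not model the secondary and tertiary confidence levels that the claim compares.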