Title of invention: Dynamically biasing language models
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a method comprises receiving audio data encoding one or more utterances; performing a first speech recognition on the audio data; identifying a context based on the first speech recognition; performing a second speech recognition on the audio data that is biased towards the context; and providing an output of the second speech recognition.
Publication number: US9502032(B2) Publication date: 2016.11.22
Application number: US201414525826 Filing date: 2014.10.28
Applicant: Google Inc. Inventors: Aleksic Petar; Moreno Mengibar Pedro J.
Classification: G10L15/26; G10L15/22; G10L15/18; G10L19/00 Main classification: G10L15/26
Agency: Fish & Richardson P.C. Agent: Fish & Richardson P.C.
Main claim: 1. A method performed by one or more computers, the method comprising: receiving audio data encoding one or more utterances; generating a recognition lattice of the one or more utterances by performing speech recognition on the audio data using a first pass speech recognizer; identifying a specific context for the one or more utterances that is referenced by the recognition lattice of the one or more utterances, generated by performing speech recognition on the audio data using the first pass speech recognizer, based on semantic analysis of the recognition lattice; in response to identifying the specific context that is referenced by the recognition lattice, selecting a second pass speech recognizer that is biased towards the specific context that is referenced by the recognition lattice of the one or more utterances, generated by performing speech recognition on the audio data using the first pass speech recognizer, based on semantic analysis of the recognition lattice; in parallel with generating a first transcription of the one or more utterances using the first pass speech recognizer, generating, by an automatic speech recognition engine, a second transcription of the one or more utterances by performing additional speech recognition on the audio data using the second pass speech recognizer that is biased towards the specific context that is referenced by the recognition lattice that was generated by performing speech recognition on the audio data using the first pass speech recognizer; and providing an output transcription of one of the first transcription of the one or more utterances or the second transcription of the one or more utterances to initiate an operation based on the output transcription.
Address: Mountain View, CA, US
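
The two-pass flow recited in the main claim can be illustrated with a small, purely hypothetical sketch. Nothing below comes from the patent itself: the "audio" is stood in for by a list of candidate transcripts with scores, the recognition lattice is simplified to an n-best list, semantic analysis is reduced to keyword matching, and the names `CONTEXT_KEYWORDS`, `CONTEXT_BOOSTS`, `first_pass`, `identify_context`, `second_pass`, and `recognize` are all invented for illustration. The rule for choosing between the two transcriptions (pick the higher score) is likewise an assumption, since the claim only says that one of them is provided.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical context definitions: keywords that signal a context, and
# per-word score multipliers used to bias recognition towards it.
CONTEXT_KEYWORDS = {"restaurants": {"pizza"}, "music": {"song"}}
CONTEXT_BOOSTS = {"restaurants": {"hut": 1.5}, "music": {"play": 1.5}}


def first_pass(audio):
    """Unbiased recognizer: returns (lattice, best hypothesis).

    The toy 'audio' already is a list of (transcript, score) pairs, so the
    lattice is simply that list (an n-best simplification).
    """
    lattice = list(audio)
    return lattice, max(lattice, key=lambda h: h[1])


def identify_context(lattice):
    """Toy semantic analysis: score-weighted keyword vote over the lattice."""
    votes = {}
    for text, score in lattice:
        words = set(text.split())
        for context, keywords in CONTEXT_KEYWORDS.items():
            if words & keywords:
                votes[context] = votes.get(context, 0.0) + score
    return max(votes, key=votes.get) if votes else None


def second_pass(audio, context):
    """Biased recognizer: re-scores the audio's candidates for the context."""
    boosts = CONTEXT_BOOSTS.get(context, {})
    rescored = []
    for text, score in audio:
        for word in text.split():
            score *= boosts.get(word, 1.0)
        rescored.append((text, score))
    return max(rescored, key=lambda h: h[1])


def recognize(audio):
    # Generate the lattice and identify the context it references.
    lattice, _ = first_pass(audio)
    context = identify_context(lattice)
    # Produce the first and second transcriptions in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        first_future = pool.submit(first_pass, audio)
        second_future = pool.submit(second_pass, audio, context)
        _, (first_text, first_score) = first_future.result()
        second_text, second_score = second_future.result()
    # Assumed selection rule: output whichever transcription scores higher.
    return second_text if second_score > first_score else first_text


AUDIO = [("call pizza hat", 0.50), ("call pizza hut", 0.45), ("cold pizza hut", 0.05)]
print(recognize(AUDIO))
```

In this toy input the unbiased pass prefers "call pizza hat", while the restaurant-biased pass boosts hypotheses containing "hut", so the biased transcription is the one returned. When no context is identified, the second pass applies no boosts and the first transcription is returned unchanged.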