Title Methods and systems for training dictation-based speech-to-text systems using recorded samples
Abstract A method and apparatus useful to train speech recognition engines is provided. Many of today's speech recognition engines require training to particular individuals to accurately convert speech to text. The training requires the use of significant resources for certain applications. To alleviate the resources, a trainer is provided with the text transcription and the audio file. The trainer updates the text based on the audio file. The changes are provided to the speech recognition to train the recognition engine and update the user profile. In certain aspects, the training is reversible as it is possible to over train the system such that the trained system is actually less proficient.
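The abstract says the trainer's changes, rather than the whole transcript, are passed back to the recognition engine. A minimal sketch of how such changes could be extracted is shown below, assuming a word-level diff between the engine's text and the trainer's corrected text. The function name `extract_corrections` and the pair format are hypothetical and are not taken from the patent.

```python
from difflib import SequenceMatcher


def extract_corrections(transcribed: str, corrected: str) -> list[tuple[str, str]]:
    """Return (engine_text, trainer_text) pairs for each span the trainer changed.

    Assumption: corrections are modeled as word-level differences. The patent
    does not specify a representation.
    """
    engine_words = transcribed.split()
    trainer_words = corrected.split()
    matcher = SequenceMatcher(a=engine_words, b=trainer_words, autojunk=False)

    corrections = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Insertions yield an empty engine span, deletions an empty trainer span.
            corrections.append(
                (" ".join(engine_words[i1:i2]), " ".join(trainer_words[j1:j2]))
            )
    return corrections
```

Calling `extract_corrections("the patient has a hart murmur", "the patient has a heart murmur")` yields a single pair mapping "hart" to "heart", which is the kind of change that would be transmitted for training.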
Publication No. US8744848(B2) Publication Date 2014.06.03
Application No. US201113091509 Filing Date 2011.04.21
Applicant NVQQ Incorporated Inventors Hoepfinger Jeffrey; Mondragon David
Classification G10L15/00 Main Classification G10L15/00
Agency Agent
Independent Claim 1. A method for providing data to train a speech recognition engine performed on at least one processor, the method comprising: displaying to a trainer text from an audio file that has been transcribed by a speech recognition engine that is adapted to be trained and uses a user profile; playing to the trainer the audio file that was used by the speech recognition engine to generate the text being displayed; correcting the text on the display by the trainer based on discrepancies between the audio being played and the text being displayed; and transmitting the corrections to the speech to text engine to train the speech recognition engine to update the user profile, training the speech recognition engine based on the corrections, wherein training the speech recognition engine comprises: identifying a user profile of a user to be trained; designating the user profile as the initial user profile; saving the initial user profile; training the initial user profile as the interim user profile; using the interim user profile to transcribe an audio file of the user; determining a performance metric of the speech recognition engine using the interim user profile; comparing the performance metric using the interim user profile to a performance metric using the initial user profile; if the performance metric using the interim user profile is better than the performance metric using the initial user profile, replace the user profile with the interim user profile; and if the performance metric using the interim user profile is worse than the performance metric using the initial user profile, discard the interim user profile.
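The reversible training steps recited in the claim (save the initial profile, train an interim profile, compare a performance metric, then keep or discard) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the claim does not name a metric, so word error rate is assumed here, and `train_reversibly`, `toy_train`, and `toy_transcribe` are hypothetical stand-ins for a real engine. When the two metrics tie, the claim is silent; this sketch keeps the initial profile.

```python
import copy
from typing import Callable


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference length (assumed metric)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / max(len(ref), 1)


def train_reversibly(profile: dict, corrections: dict, audio: str, reference: str,
                     train: Callable[[dict, dict], dict],
                     transcribe: Callable[[dict, str], str]) -> dict:
    """Return the interim profile only if it lowers the error rate."""
    initial = copy.deepcopy(profile)                       # save the initial profile
    interim = train(copy.deepcopy(initial), corrections)   # train an interim copy
    initial_wer = word_error_rate(reference, transcribe(initial, audio))
    interim_wer = word_error_rate(reference, transcribe(interim, audio))
    if interim_wer < initial_wer:
        return interim   # better: replace the user profile
    return initial       # worse (or tied, by assumption): discard the interim profile


# Toy stand-ins: a profile is a word-substitution table, and "audio" is
# represented by the raw words the untrained engine would output.
def toy_train(profile: dict, corrections: dict) -> dict:
    profile.update(corrections)
    return profile


def toy_transcribe(profile: dict, audio: str) -> str:
    return " ".join(profile.get(word, word) for word in audio.split())
```

With the toy engine, training on the correction `{"hart": "heart"}` against the reference "a heart murmur" lowers the error rate and the interim profile is kept, while training on a harmful correction such as `{"murmur": "murder"}` raises it and the initial profile is restored.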
Address Boulder CO US