Title | Utterance selection for automated speech recognizer training | ||
Abstract | Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a set of training utterances. The methods, systems, and apparatus include actions of obtaining a target multi-dimensional distribution of characteristics in an initial set of candidate utterances and selecting a subset of the initial set of candidate utterances based on speech recognition confidence scores associated with the candidate utterances. Additional actions include selecting a particular candidate utterance from the subset of the initial set of utterances and determining that adding the particular candidate utterance to a set of training utterances reduces a divergence of a multi-dimensional distribution of the characteristics in the set of training utterances from the target multi-dimensional distribution. Further actions include adding the particular candidate utterance to the set of training utterances. | ||
Publication No. | US9263033(B2) | Publication Date | 2016.02.16 |
Application No. | US201414314295 | Filing Date | 2014.06.25 |
Applicant | Google Inc. | Inventors | Siohan Olivier; Mengibar Pedro J. |
Classification | G10L15/00; G10L15/06 | Main Classification | G10L15/00 |
Agency | Fish & Richardson P.C. | Agent | Fish & Richardson P.C. |
Independent Claim | 1. A computer-implemented method comprising: obtaining a target multi-dimensional distribution of characteristics in an initial set of candidate utterances; selecting a subset of the initial set of candidate utterances based on speech recognition confidence scores associated with the candidate utterances; selecting a particular candidate utterance from the subset of the initial set of utterances; determining that adding the particular candidate utterance to a set of training utterances reduces a divergence of a multi-dimensional distribution of the characteristics in the set of training utterances from the target multi-dimensional distribution; and adding the particular candidate utterance to the set of training utterances. | ||
Address | Mountain View CA US |
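
The selection steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the record does not name a divergence measure, a confidence threshold, or a data layout, so the use of Kullback-Leibler divergence, the `min_confidence` value of 0.8, the single greedy pass over the subset, and the `(utterance_id, confidence, characteristics)` tuple format are all assumptions made for illustration. The function names are hypothetical.

```python
import math
from collections import Counter


def kl_divergence(counts, target):
    """KL(P || Q), where P is the empirical distribution given by `counts`
    and Q is `target`. Cells with a zero count contribute nothing.
    (KL divergence is an assumed choice; the record names no measure.)"""
    total = sum(counts.values())
    return sum(
        (n / total) * math.log((n / total) / target[key])
        for key, n in counts.items()
        if n > 0
    )


def select_training_utterances(candidates, min_confidence=0.8):
    """Greedy sketch of the claimed steps.

    candidates: list of (utterance_id, confidence, characteristics), where
    characteristics is a hashable tuple such as ("male", "short").
    min_confidence is a hypothetical threshold.
    """
    # Target: joint distribution of characteristics in the initial set.
    target_counts = Counter(chars for _, _, chars in candidates)
    n_total = sum(target_counts.values())
    target = {key: n / n_total for key, n in target_counts.items()}

    # Subset chosen by speech recognition confidence score.
    subset = [c for c in candidates if c[1] >= min_confidence]

    training = []
    counts = Counter()
    current = math.inf  # divergence is undefined for an empty training set

    for utterance_id, _, chars in subset:
        trial = counts.copy()
        trial[chars] += 1
        new = kl_divergence(trial, target)
        # Add the utterance only if it reduces the divergence from the target.
        if new < current:
            training.append(utterance_id)
            counts, current = trial, new
    return training


# Toy data: characteristics are (speaker gender, utterance length) pairs.
candidates = [
    ("a", 0.90, ("male", "short")),
    ("b", 0.95, ("male", "short")),
    ("c", 0.50, ("female", "long")),  # dropped by the confidence filter
    ("d", 0.90, ("female", "long")),
    ("e", 0.90, ("male", "short")),
]
print(select_training_utterances(candidates))  # ['a', 'd', 'e']
```

In this toy run, utterance "b" is skipped because adding a second ("male", "short") item leaves the training distribution unchanged and so does not reduce the divergence, while "d" and "e" each move it closer to the 60/40 target.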