Title: Discovery of problematic pronunciations for automatic speech recognition systems
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for discovery of problematic pronunciations for automatic speech recognition systems. One of the methods includes determining a frequency of occurrences of one or more n-grams in transcribed text and a frequency of occurrences of the n-grams in typed text and classifying a system pronunciation of a word included in the n-grams as correct or incorrect based on the frequencies. The n-grams may comprise one or more words and at least one of the words is classified as incorrect based on the frequencies. The frequencies of the specific n-grams may be determined across a domain using one or more n-grams that typically appear adjacent to the specific n-grams.
Publication number: US8959020(B1)  Publication date: 2015.02.17
Application number: US201313853150  Filing date: 2013.03.29
Applicant: Google Inc.  Inventors: Strope Brian; Beaufays Francoise; Strohman Trevor D.
Classification: G10L15/18; G06F17/24  Main classification: G10L15/18
Agency: Fish & Richardson P.C.  Agent: Fish & Richardson P.C.
Main claim: 1. A computer-implemented method comprising: receiving, by one or more computers, transcribed data from a speech recognition system, wherein the transcribed data includes one or more first transcribed n-grams; receiving, by at least one of the computers, a corpus of typed text including a plurality of typed n-grams; determining a transcribed frequency for a specific n-gram, the specific n-gram being one of the typed n-grams included in the corpus of typed text, the transcribed frequency being based on a first quantity of occurrences in which the specific n-gram is one of the first transcribed n-grams included in the transcribed data; determining a typed frequency for the specific n-gram, the typed frequency being based on a second quantity of occurrences in which the specific n-gram is one of the typed n-grams included in the corpus of typed text; comparing the transcribed frequency for the specific n-gram with the typed frequency for the specific n-gram; and classifying, based on the comparing, a system pronunciation associated with the specific n-gram as occurring frequently in a plurality of spoken phrases included in the transcribed data or occurring infrequently in the plurality of spoken phrases, wherein the system pronunciation is for use by the speech recognition system in transcribing future utterances.
Address: Mountain View CA US
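
Illustrative sketch (not part of the patent record): the frequency comparison recited in the main claim could look roughly like the Python below. This is a minimal sketch under stated assumptions, not the patented implementation. The function names (`ngrams`, `relative_frequencies`, `classify_ngrams`), the whitespace tokenization, the use of relative frequencies, the ratio test, and the `ratio_threshold` value of 0.5 are all hypothetical choices made for illustration; the record does not specify how the two frequencies are normalized or compared.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequencies(sentences, n):
    """Map each n-gram to its share of all n-gram occurrences in the sentences."""
    counts = Counter()
    for sentence in sentences:
        # Assumption: lowercase, whitespace-separated tokens.
        counts.update(ngrams(sentence.lower().split(), n))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


def classify_ngrams(transcribed, typed, n=1, ratio_threshold=0.5):
    """Label each typed n-gram as 'frequent' or 'infrequent' in transcribed data.

    Assumption: an n-gram is 'infrequent' when its transcribed relative
    frequency is below ratio_threshold times its typed relative frequency.
    """
    transcribed_freq = relative_frequencies(transcribed, n)
    typed_freq = relative_frequencies(typed, n)
    labels = {}
    # Only n-grams that occur in the typed corpus are classified.
    for gram, typed_f in typed_freq.items():
        transcribed_f = transcribed_freq.get(gram, 0.0)
        ratio = transcribed_f / typed_f
        labels[gram] = "infrequent" if ratio < ratio_threshold else "frequent"
    return labels
```

For example, calling `classify_ngrams` with typed text in which a name appears often and recognizer output in which it never appears would label that name's unigram "infrequent", which, on the abstract's reasoning, suggests the system pronunciation for that word may be incorrect.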