Title: Training acoustic models using distributed computing techniques
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training acoustic models. Speech data and data identifying a transcription for the speech data are received. A phonetic representation for the transcription is accessed. Training sequences are identified for a particular phone in the phonetic representation. Each of the training sequences includes a different set of contextual phones surrounding the particular phone. A partitioning key is identified based on a sequence of phones that occurs in each of the training sequences. A processing module to which the identified partitioning key is assigned is selected. Data identifying the training sequences and a portion of the speech data are transmitted to the selected processing module.
Publication number: US8959014(B2) Publication date: 2015.02.17
Application number: US201213539225 Filing date: 2012.06.29
Applicant: Google Inc. Inventors: Xu Peng; Pereira Fernando; Chelba Ciprian I.
Classification: G06F17/21; G06F17/20; G10L15/08; G10L15/14; G10L15/187; G10L15/04; G10L15/34 Main classification: G06F17/21
Agency: Fish & Richardson P.C. Agent: Fish & Richardson P.C.
Main claim: 1. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: receiving speech data and data identifying a transcription for the speech data; accessing a phonetic representation for the transcription; extracting training sequences from the phonetic representation for a particular phone in the phonetic representation, the training sequences comprising two or more training sequences that include (i) a particular sequence of multiple phones and (ii) a different number of contextual phones surrounding the particular phone; identifying a partitioning key for the training sequences based on the particular sequence of multiple phones that occurs in the two or more training sequences; selecting, from among a plurality of processing modules, a processing module to which the identified partitioning key is assigned, the processing module being designated to train a portion of an acoustic model that corresponds to the identified partitioning key; and transmitting, to the selected processing module, (i) data identifying the training sequences and (ii) a portion of the speech data that corresponds to the training sequence that includes the most contextual phones.
Address: Mountain View CA US
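
The extract, key, select, and transmit steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: the shared "particular sequence of multiple phones" is taken to be the central triphone, context grows symmetrically by one phone per side, keys are assigned to modules by a CRC32 hash modulo the module count, and the function names, the `max_context` parameter, and the example phone string are all hypothetical. The audio slice itself is left out; the message only records which sequence it would correspond to.

```python
import zlib


def extract_training_sequences(phones, index, max_context=3):
    """Return phone sequences centred on phones[index], one per context size.

    The k-th sequence carries k contextual phones on each side, so every
    sequence contains the shortest one (the central triphone).
    """
    # Largest symmetric context that fits inside the utterance.
    limit = min(max_context, index, len(phones) - 1 - index)
    if limit < 1:
        raise ValueError("need at least one phone on each side of the target")
    return [tuple(phones[index - k:index + k + 1]) for k in range(1, limit + 1)]


def partition_key(sequences):
    """Assumed key: the shortest sequence, which occurs inside all the others."""
    return min(sequences, key=len)


def select_module(key, num_modules):
    """Map a key to a module index with a hash that is stable across processes."""
    digest = zlib.crc32("-".join(key).encode("utf-8"))
    return digest % num_modules


def build_message(sequences, key, num_modules):
    """Bundle what would be sent to the selected module (audio slice omitted)."""
    return {
        "module": select_module(key, num_modules),
        "key": key,
        "sequences": sequences,
        # The claim sends the speech data matching the longest sequence.
        "audio_for": max(sequences, key=len),
    }


phones = ["sil", "ae", "k", "sh", "ih", "n", "sil"]  # hypothetical phone string
sequences = extract_training_sequences(phones, index=3)
message = build_message(sequences, partition_key(sequences), num_modules=4)
print(message["key"], "-> module", message["module"])
```

Because the key in this sketch depends only on the shared central sequence, every training sequence built around the same phones is routed to the same module regardless of how much context it carries, which matches the claim's requirement that one module trains the portion of the acoustic model corresponding to that key.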