Title of Invention: EFFICIENT GENERATION OF COMPLEMENTARY ACOUSTIC MODELS FOR PERFORMING AUTOMATIC SPEECH RECOGNITION SYSTEM COMBINATION
Abstract: Systems and processes for generating complementary acoustic models for performing automatic speech recognition system combination are provided. In one example process, a deep neural network can be trained using a set of training data. The trained deep neural network can be a deep neural network acoustic model. A Gaussian-mixture model can be linked to a hidden layer of the trained deep neural network such that any feature vector outputted from the hidden layer is received by the Gaussian-mixture model. The Gaussian-mixture model can be trained via a first portion of the trained deep neural network and using the set of training data. The first portion of the trained deep neural network can include an input layer of the deep neural network and the hidden layer. The first portion of the trained deep neural network and the trained Gaussian-mixture model can be a Deep Neural Network-Gaussian-Mixture Model (DNN-GMM) acoustic model.
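For orientation only, the following is a minimal Python sketch of the architecture described in the abstract, assuming PyTorch; the class name DNNAcousticModel, the layer sizes, and the sigmoid activations are illustrative assumptions, not the patent's implementation. It shows a DNN acoustic model whose first portion (input layer through a chosen hidden layer) emits the feature vectors that a linked Gaussian-mixture model would receive.

```python
# Illustrative sketch only; names, sizes, and activations are assumptions.
import torch
import torch.nn as nn

class DNNAcousticModel(nn.Module):
    def __init__(self, n_features=40, n_hidden=512, n_senones=3000):
        super().__init__()
        # "First portion": input layer through the chosen hidden layer.
        self.first_portion = nn.Sequential(
            nn.Linear(n_features, n_hidden), nn.Sigmoid(),
            nn.Linear(n_hidden, n_hidden), nn.Sigmoid(),
        )
        # Output layer over senone classes (the full DNN acoustic model).
        self.output = nn.Linear(n_hidden, n_senones)

    def forward(self, x):
        return self.output(self.first_portion(x))

    def linked_features(self, x):
        # Feature vectors emitted by the chosen hidden layer; in the described
        # process, any such vector is received by the linked Gaussian-mixture model.
        with torch.no_grad():
            return self.first_portion(x)
```

In a tandem setup of this kind, the same first portion serves both the DNN acoustic model and, together with the trained Gaussian-mixture model, the complementary DNN-GMM acoustic model described in the abstract.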
Publication Number: US2016034811(A1)    Publication Date: 2016.02.04
Application Number: US201414503028    Filing Date: 2014.09.30
Applicant: Apple Inc.    Inventors: PAULIK Matthias; KRISHNAMOORTHY Mahesh
Classification: G06N3/08; G06N3/04    Main Classification: G06N3/08
Agency:    Agent:
Claim 1: A method for generating complementary acoustic models for performing automatic speech recognition system combination, the method comprising:
  at a device with a processor and memory storing instructions for execution by the processor:
    training a deep neural network using a set of training data, wherein the deep neural network comprises an input layer, an output layer, and a plurality of hidden layers disposed between the input layer and the output layer, wherein training the deep neural network comprises:
      determining, using the set of training data, a set of optimal weighting values of the deep neural network; and
      storing the set of optimal weighting values in the memory;
    linking a Gaussian-mixture model to a hidden layer of the trained deep neural network such that any feature vector outputted from the hidden layer is received by the Gaussian-mixture model; and
    training the Gaussian-mixture model via a first portion of the trained deep neural network and using the set of training data, wherein the first portion of the trained deep neural network includes the input layer and the hidden layer, and wherein training the Gaussian-mixture model comprises:
      determining, using the set of training data, a set of optimal parameter values of the Gaussian-mixture model; and
      storing the set of optimal parameter values in the memory.
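A hedged end-to-end sketch of the two training stages recited in claim 1 follows, again assuming PyTorch and scikit-learn; the data shapes, hyperparameters, number of mixture components, and file names are placeholders, and random tensors stand in for the actual training data.

```python
# Sketch of claim 1's two training stages; all values below are placeholders.
import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

# --- Stage 1: train the deep neural network (determine and store its weights) ---
dnn = nn.Sequential(
    nn.Linear(40, 512), nn.Sigmoid(),       # input layer -> hidden layer 1
    nn.Linear(512, 512), nn.Sigmoid(),      # hidden layer 2 (the layer the GMM links to)
    nn.Linear(512, 3000),                   # output layer over senone classes
)
frames = torch.randn(10000, 40)             # stand-in acoustic feature frames
targets = torch.randint(0, 3000, (10000,))  # stand-in senone labels
optimizer = torch.optim.SGD(dnn.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for _ in range(5):                          # a few epochs for illustration
    optimizer.zero_grad()
    loss = loss_fn(dnn(frames), targets)
    loss.backward()
    optimizer.step()
torch.save(dnn.state_dict(), "dnn_weights.pt")   # store the optimal weighting values

# --- Stage 2: train the GMM via the first portion of the trained DNN ---
first_portion = dnn[:4]                     # input layer through the chosen hidden layer
with torch.no_grad():
    hidden_feats = first_portion(frames).numpy()
gmm = GaussianMixture(n_components=64, covariance_type="diag").fit(hidden_feats)
# Store the optimal GMM parameter values (mixture weights, means, covariances).
torch.save({"weights": gmm.weights_, "means": gmm.means_, "covs": gmm.covariances_},
           "gmm_params.pt")
```

The stored DNN weights and GMM parameters then define the two complementary acoustic models (the DNN and the DNN-GMM) that can be combined at recognition time, as described in the abstract.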
Address: Cupertino, CA, US