Title of invention: SYSTEM AND METHOD FOR APPLYING A CONVOLUTIONAL NEURAL NETWORK TO SPEECH RECOGNITION
Abstract: A system and method for applying a convolutional neural network (CNN) to speech recognition. The CNN may provide input to a hidden Markov model and has at least one pair of a convolution layer and a pooling layer. The CNN operates along the frequency axis. The CNN has units that operate upon one or more local frequency bands of an acoustic signal. The CNN mitigates acoustic variation.
Publication number: US2014288928(A1)
Publication date: 2014.09.25
Application number: US201313849793
Filing date: 2013.03.25
Applicant: Penn Gerald Bradley;Jiang Hui;Abdelhamid Ossama Abdelhamid Mohamed;Mohamed Abdel-rahman Samir Abdel-rahman
Inventor: Penn Gerald Bradley;Jiang Hui;Abdelhamid Ossama Abdelhamid Mohamed;Mohamed Abdel-rahman Samir Abdel-rahman
Classification: G10L15/16
Main classification: G10L15/16
Agency:
Agent:
Principal claim: 1. A method for applying a convolutional neural network to a speech signal to mitigate acoustic variation in speech, the convolutional neural network comprising at least one processor, the method comprising:
(a) obtaining an acoustic signal comprising speech;
(b) preprocessing the acoustic signal to:
(i) transform the acoustic signal to its frequency domain representation; and
(ii) divide the frequency domain representation into a plurality of frequency bands;
(c) providing the plurality of frequency bands to a convolution layer of the convolutional neural network, the convolution layer comprising a plurality of convolution units each receiving input from at least one of the frequency bands; and
(d) providing the output of the convolution layer to a pooling layer of the convolutional neural network, the pooling layer comprising a plurality of pooling units each receiving input from at least one of the convolution units, the output of the pooling layer being a representation of the acoustic signal mitigating acoustic variation.
Address: Scarborough CA
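
Steps (a) through (d) of claim 1 can be illustrated with a minimal sketch. It is not the applicants' implementation. The record does not specify frame length, band layout, weights, activation function, or pooling operator, so everything below is an assumption made for illustration: the function names (power_spectrum, to_bands, conv_layer, max_pool), the naive DFT, the equal-width log-energy bands, the shared-weight sigmoid convolution units, and the non-overlapping max pooling. The input frame is a synthetic sine wave standing in for a speech signal.

```python
import cmath
import math


def power_spectrum(frame):
    """Step (b)(i): naive DFT of one frame, returned as per-bin power."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))) ** 2
        for k in range(n // 2 + 1)
    ]


def to_bands(spectrum, num_bands):
    """Step (b)(ii): equal-width bands with log-compressed energy (assumed)."""
    width = len(spectrum) // num_bands
    return [math.log1p(sum(spectrum[b * width:(b + 1) * width]))
            for b in range(num_bands)]


def conv_layer(bands, weights, bias):
    """Step (c): each unit sees len(weights) adjacent bands.

    Weights are shared across positions and the activation is a sigmoid;
    both choices are assumptions, not taken from the record.
    """
    span = len(weights)
    outputs = []
    for start in range(len(bands) - span + 1):
        z = bias + sum(w * bands[start + i] for i, w in enumerate(weights))
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


def max_pool(units, size):
    """Step (d): each pooling unit takes the max over `size` adjacent units."""
    return [max(units[i:i + size])
            for i in range(0, len(units) - size + 1, size)]


# Step (a): a synthetic 64-sample frame (a sine at DFT bin 5), used here
# as a stand-in for an acoustic signal comprising speech.
frame = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]

bands = to_bands(power_spectrum(frame), num_bands=8)
conv_out = conv_layer(bands, weights=[0.2, 0.5, 0.2], bias=-1.0)
pooled = max_pool(conv_out, size=2)

print(len(bands), len(conv_out), len(pooled))  # prints "8 6 3"
```

Because each pooling unit keeps only the maximum over neighbouring convolution units, a small shift of energy between adjacent frequency bands tends to change the pooled output less than it changes the band energies themselves, which is one way to read the claim's "mitigating acoustic variation". In a full recognizer the pooled values would feed further layers and, per the abstract, possibly a hidden Markov model; that part is not sketched here.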