Title of Invention: FAST DEEP NEURAL NETWORK FEATURE TRANSFORMATION VIA OPTIMIZED MEMORY BANDWIDTH UTILIZATION
Abstract: Deep Neural Networks (DNNs) with many hidden layers and many units per layer are very flexible models with a very large number of parameters. As such, DNNs are challenging to optimize. To achieve real-time computation, embodiments disclosed herein enable fast DNN feature transformation via optimized memory bandwidth utilization. To optimize memory bandwidth utilization, a rate of accessing memory may be reduced based on a batch setting. A memory, corresponding to a selected given output neuron of a current layer of the DNN, may be updated with an incremental output value computed for the selected given output neuron as a function of input values of a selected few non-zero input neurons of a previous layer of the DNN in combination with weights between the selected few non-zero input neurons and the selected given output neuron, wherein a number of the selected few corresponds to the batch setting.
Publication Number: US2016322042(A1)
Publication Date: 2016.11.03
Application Number: US201514699778
Filing Date: 2015.04.29
Applicant: Nuance Communications, Inc.
Inventors: Vlietinck Jan; Kanthak Stephan; Vuerinckx Rudi; Ris Christophe
IPC Classification: G10L15/06; G06N3/08
Main IPC Classification: G10L15/06
Agency:
Agent:
Main Claim:
1. A method for improving computation time of speech recognition processing in an electronic device, the method comprising:
by a processor:
updating a memory, corresponding to a selected given output neuron of a current layer of a Deep Neural Network (DNN), with an incremental output value computed for the selected given output neuron as a function of input values of a selected few non-zero input neurons of a previous layer of the DNN in combination with weights between the selected few non-zero input neurons and the selected given output neuron, wherein a number of the selected few corresponds to a batch setting;
iterating the updating for each output neuron of the current layer; and
repeating the updating and the iterating for each next selected few non-zero input neurons of the previous layer to reduce a rate of accessing the memory based on the batch setting to improve the computation time of the speech recognition processing.
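The claimed technique can be read as a batched sparse layer evaluation: rather than performing one read-modify-write of each output accumulator per non-zero input neuron, a batch of non-zero inputs is gathered first, so each accumulator is touched once per batch, cutting memory traffic roughly by the batch setting. The following is only an illustrative sketch of that access pattern, not the patented implementation; the function name `batched_sparse_layer` and the `batch_setting` parameter are assumptions for this example.

```python
import numpy as np

def batched_sparse_layer(inputs, weights, batch_setting=4):
    """Compute one layer's pre-activations, processing non-zero input
    neurons in batches of size `batch_setting` so each output
    accumulator in memory is updated once per batch rather than once
    per non-zero input neuron."""
    nonzero = np.flatnonzero(inputs)       # indices of non-zero input neurons
    out = np.zeros(weights.shape[1])       # one accumulator per output neuron
    for start in range(0, len(nonzero), batch_setting):
        batch = nonzero[start:start + batch_setting]  # the "selected few"
        for j in range(weights.shape[1]):             # iterate over output neurons
            # a single read-modify-write of out[j] covers the whole batch
            out[j] += np.dot(inputs[batch], weights[batch, j])
    return out
```

Because zero-valued inputs contribute nothing, the result matches a dense matrix-vector product `inputs @ weights`; only the order and rate of accumulator accesses differ.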
Address: Burlington, MA, US