Title of Invention: FAST DEEP NEURAL NETWORK FEATURE TRANSFORMATION VIA OPTIMIZED MEMORY BANDWIDTH UTILIZATION
Abstract: Deep Neural Networks (DNNs) with many hidden layers and many units per layer are highly flexible models with a very large number of parameters. As such, DNNs are challenging to optimize. To achieve real-time computation, embodiments disclosed herein enable fast DNN feature transformation via optimized memory bandwidth utilization. To optimize memory bandwidth utilization, the rate of accessing memory may be reduced based on a batch setting. A memory, corresponding to a selected given output neuron of a current layer of the DNN, may be updated with an incremental output value computed for the selected given output neuron as a function of input values of a selected few non-zero input neurons of a previous layer of the DNN in combination with weights between the selected few non-zero input neurons and the selected given output neuron, wherein the number of the selected few corresponds to the batch setting.
Publication Number: US2016322042(A1)    Publication Date: 2016.11.03
Application Number: US201514699778    Filing Date: 2015.04.29
Applicant: Nuance Communications, Inc.    Inventors: Vlietinck Jan; Kanthak Stephan; Vuerinckx Rudi; Ris Christophe
Classification: G10L15/06; G06N3/08    Main Classification: G10L15/06
Independent Claim: 1. A method for improving computation time of speech recognition processing in an electronic device, the method comprising: by a processor:
updating a memory, corresponding to a selected given output neuron of a current layer of a Deep Neural Network (DNN), with an incremental output value computed for the selected given output neuron as a function of input values of a selected few non-zero input neurons of a previous layer of the DNN in combination with weights between the selected few non-zero input neurons and the selected given output neuron, wherein a number of the selected few corresponds to a batch setting;
iterating the updating for each output neuron of the current layer; and
repeating the updating and the iterating for each next selected few non-zero input neurons of the previous layer to reduce a rate of accessing the memory based on the batch setting to improve the computation time of the speech recognition processing.
Address: Burlington, MA, US
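
The C function below is a minimal sketch of the batched, sparsity-exploiting forward pass described in the abstract and in claim 1, assuming a dense row-major weight matrix and a precomputed list of the non-zero input neurons of the previous layer. All identifiers (BATCH, layer_forward_batched, in_idx, in_val, n_nonzero, weights, n_out, out) are illustrative choices and not terms from the published record.

    /* Minimal sketch of the batched update of claim 1 (illustrative only). */
    #include <stddef.h>

    #define BATCH 4  /* the "batch setting": non-zero inputs processed per pass */

    /* in_idx/in_val: indices and values of the non-zero input neurons of the
     * previous layer; weights: dense weight matrix, row-major, one row per
     * input neuron and n_out columns; out: accumulators for the current
     * layer's output neurons. */
    void layer_forward_batched(const int *in_idx, const float *in_val,
                               size_t n_nonzero, const float *weights,
                               size_t n_out, float *out)
    {
        /* Repeat for each next "selected few" non-zero input neurons. */
        for (size_t b = 0; b < n_nonzero; b += BATCH) {
            size_t k = (n_nonzero - b < BATCH) ? (n_nonzero - b) : BATCH;

            /* Iterate over every output neuron of the current layer. */
            for (size_t o = 0; o < n_out; ++o) {
                float inc = 0.0f;

                /* Incremental output value from the selected few non-zero
                 * inputs combined with their weights to this output. */
                for (size_t j = 0; j < k; ++j)
                    inc += in_val[b + j]
                         * weights[(size_t)in_idx[b + j] * n_out + o];

                /* One read-modify-write of the output accumulator per batch
                 * of up to BATCH inputs, rather than one per input. */
                out[o] += inc;
            }
        }
    }

With the batch setting of 4 used here, each output accumulator out[o] is read and written once per four non-zero inputs instead of once per input, which is how the claimed scheme reduces the rate of accessing the memory.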