Independent claim |
1. An apparatus, comprising:
an output buffer that holds N words arranged as N/J mutually exclusive output buffer word groups of J words each of the N words, J is greater than 2 and N is at least twice J;
an array of N processing units (PU) arranged as N/J mutually exclusive PU groups of J PUs each of the N PUs, each PU group of the N/J PU groups has an associated output buffer word group of the N/J output buffer word groups, each PU having:
first and second multiplexed registers each having:
at least J+1 inputs, a first input of the J+1 inputs receives an operand from a memory and the other J inputs receive the J words of the associated output buffer word group;
an output; and
a control input that controls selection of the J+1 inputs for provision on the output;
an accumulator having an output for provision to a respective one of the N output buffer words; and
an arithmetic unit having first and second inputs to receive the output of the first and second multiplexed registers, respectively, and a third input that receives the accumulator output, the arithmetic unit performs an operation on the first, second and third inputs to generate a result for accumulation into the accumulator;
the output buffer includes a mask input that controls which words, if any, of the N words retain their current value or are updated with their respective accumulator output; and
each PU group of the N/J PU groups of J PUs operates as a Long Short Term Memory (LSTM) cell of a recurrent neural network, a first of the J PUs computes an input gate, a second of the J PUs computes a forget gate, and a third of the J PUs computes an output gate of the LSTM cell. |
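The data path recited above can be illustrated with a minimal software sketch. It is not part of the claim and not the claimed hardware. Several details are assumptions made only for illustration: the values N = 8 and J = 4, the choice of multiply-accumulate as the arithmetic operation (the claim says only "an operation"), the polarity of the mask (True meaning "retain"), and all function names.

```python
N = 8  # total words and PUs (assumed value)
J = 4  # words per group (assumed value); the claim requires J > 2 and N >= 2*J


def group_words(out_buf, pu_index):
    """Return the J output buffer words associated with this PU's group."""
    g = pu_index // J
    return out_buf[g * J:(g + 1) * J]


def mux_select(mem_operand, words, control):
    """Model a (J+1)-input multiplexed register.

    control == 0 selects the memory operand; control in 1..J selects
    the corresponding word of the associated output buffer word group.
    """
    inputs = [mem_operand] + list(words)
    return inputs[control]


def pu_step(acc, mem_a, mem_b, words, ctrl_a, ctrl_b):
    """One PU step: two mux-reg outputs plus the accumulator feed the
    arithmetic unit. Multiply-accumulate is an assumed operation."""
    a = mux_select(mem_a, words, ctrl_a)
    b = mux_select(mem_b, words, ctrl_b)
    return acc + a * b


def write_output_buffer(out_buf, accumulators, retain_mask):
    """Masked write: a word keeps its value where retain_mask is True
    (assumed polarity) and takes its PU's accumulator output otherwise."""
    return [old if keep else acc
            for old, acc, keep in zip(out_buf, accumulators, retain_mask)]


# Hypothetical example: PU 5 belongs to group 1, whose words are [5, 6, 7, 8].
out_buf = [1, 2, 3, 4, 5, 6, 7, 8]
words = group_words(out_buf, 5)
# First mux-reg takes the memory operand (10); second takes group word 3 (7).
acc5 = pu_step(acc=1, mem_a=10, mem_b=20, words=words, ctrl_a=0, ctrl_b=3)

accumulators = [0] * N
accumulators[5] = acc5
retain_mask = [True] * N
retain_mask[5] = False  # only word 5 is updated
new_buf = write_output_buffer(out_buf, accumulators, retain_mask)
```

In this sketch each PU can read only the J words of its own group, which is what lets a group of J PUs exchange gate values among themselves in the manner the LSTM limitation describes.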