Title of invention |
COMPRESSED RECURRENT NEURAL NETWORK MODELS |
Abstract |
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing long short-term memory layers with compressed gating functions. One of the systems includes a first long short-term memory (LSTM) layer, wherein the first LSTM layer is configured to, for each of the plurality of time steps, generate a new layer state and a new layer output by applying a plurality of gates to a current layer input, a current layer state, and a current layer output, each of the plurality of gates being configured to, for each of the plurality of time steps, generate a respective intermediate gate output vector by multiplying a gate input vector and a gate parameter matrix. The gate parameter matrix for at least one of the plurality of gates is a structured matrix or is defined by a compressed parameter matrix and a projection matrix. |
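The abstract's second option, a gate parameter matrix "defined by a compressed parameter matrix and a projection matrix", can be illustrated with a small sketch. It assumes, purely for illustration, that the gate parameter matrix is the product of the two factors; the abstract does not state the exact composition. The names `Z`, `P`, `rank`, and `compressed_gate_preactivation` are hypothetical and do not come from the patent text.

```python
import numpy as np


def compressed_gate_preactivation(x, Z, P):
    """Intermediate gate output for an assumed factorization W = Z @ P.

    Multiplying by P first and then by Z avoids ever forming the full matrix W.
    """
    return Z @ (P @ x)


rng = np.random.default_rng(0)
n_out, n_in, rank = 8, 6, 2               # illustrative sizes only

Z = rng.standard_normal((n_out, rank))    # hypothetical compressed parameter matrix
P = rng.standard_normal((rank, n_in))     # hypothetical projection matrix
x = rng.standard_normal(n_in)             # gate input vector

dense_result = (Z @ P) @ x                # reference: build W explicitly, then multiply
factored_result = compressed_gate_preactivation(x, Z, P)

dense_params = n_out * n_in               # parameters in an unstructured W
factored_params = rank * (n_out + n_in)   # parameters in Z and P together
```

Under this assumption the two results agree to floating-point precision, while the factored form stores `rank * (n_out + n_in)` parameters instead of `n_out * n_in`.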
Publication number |
US2017076196(A1) |
Publication date |
2017.03.16 |
Application number |
US201615172457 |
Filing date |
2016.06.03 |
Applicant |
Google Inc. |
Inventors |
Sainath Tara N.;Sindhwani Vikas |
Classification codes |
G06N3/04;G06N3/08 |
Main classification code |
G06N3/04 |
Agency |
|
Agent |
|
Main claim |
1. A system comprising:
a recurrent neural network implemented by one or more computers, wherein the recurrent neural network is configured to receive a respective neural network input at each of a plurality of time steps and to generate a respective neural network output at each of the plurality of time steps, and wherein the recurrent neural network comprises:
a first long short-term memory (LSTM) layer, wherein the first LSTM layer is configured to, for each of the plurality of time steps, generate a new layer state and a new layer output by applying a plurality of gates to a current layer input, a current layer state, and a current layer output, each of the plurality of gates being configured to, for each of the plurality of time steps, generate a respective intermediate gate output vector by multiplying a gate input vector and a gate parameter matrix, and wherein the gate parameter matrix for at least one of the plurality of gates is a Toeplitz-like structured matrix. |
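To make the structured-matrix idea in claim 1 concrete, the sketch below multiplies a gate input vector by a plain Toeplitz matrix, used here as the simplest stand-in for the broader "Toeplitz-like" class named in the claim. It assumes a square matrix and uses a standard FFT-based circulant embedding for the product; this is not presented as the patent's implementation. The helper names `toeplitz_dense` and `toeplitz_matvec_fft` are hypothetical.

```python
import numpy as np


def toeplitz_dense(c, r):
    """Build the n x n Toeplitz matrix with first column c and first row r (c[0] == r[0])."""
    n = len(c)
    i, j = np.indices((n, n))
    d = i - j
    # Entries depend only on the diagonal offset d = i - j.
    return np.where(d >= 0, c[np.abs(d)], r[np.abs(d)])


def toeplitz_matvec_fft(c, r, x):
    """Compute T @ x via a circulant embedding of size 2n - 1, without forming T."""
    n = len(c)
    v = np.concatenate([c, r[:0:-1]])        # first column of the embedding circulant
    x_padded = np.concatenate([x, np.zeros(n - 1)])
    y = np.fft.ifft(np.fft.fft(v) * np.fft.fft(x_padded))
    return y[:n].real


rng = np.random.default_rng(0)
n = 8                                        # illustrative size only
c = rng.standard_normal(n)                   # first column of the gate parameter matrix
r = rng.standard_normal(n)                   # first row of the gate parameter matrix
r[0] = c[0]                                  # the shared top-left entry must match
x = rng.standard_normal(n)                   # gate input vector

dense_result = toeplitz_dense(c, r) @ x      # reference: explicit matrix product
fft_result = toeplitz_matvec_fft(c, r, x)

toeplitz_params = 2 * n - 1                  # free parameters in a Toeplitz matrix
dense_params = n * n                         # free parameters in an unstructured matrix
```

The two products agree to floating-point precision. The Toeplitz form needs only `2n - 1` stored values rather than `n * n`, which is the kind of parameter saving a structured gate matrix offers.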
Address |
Mountain View CA US |