Title: Reducing dynamic range of low-rank decomposition matrices
Abstract: Features are disclosed for reducing the dynamic range of an approximated trained artificial neural network weight matrix in an automatic speech recognition system. The weight matrix may be approximated as a product of two low-rank matrices using a decomposition technique. This approximation inserts an additional layer between the two original layers connected by the weight matrix. The dynamic range of the low-rank decomposition may be reduced by taking the square root of the singular values, combining it with both low-rank matrices, and using a random rotation matrix to further compress the low-rank matrices. Reducing the dynamic range may make fixed-point scoring more effective, due to smaller quantization error, and may make the neural network more amenable to retraining after the weight matrix has been approximated. Features are also disclosed for adjusting the learning rate during retraining to account for the low-rank approximations.
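As a sketch of the factorization described in the abstract (the symbols W, U, Σ, V, R and the rank k are generic notation chosen here, not taken from the patent text), a rank-k truncated singular value decomposition can be split evenly between the two low-rank factors and wrapped with a random rotation matrix R:

```latex
W \;\approx\; U_k \Sigma_k V_k^{\top}
  \;=\; \underbrace{\bigl(U_k \Sigma_k^{1/2} R\bigr)}_{A}\,
        \underbrace{\bigl(R^{\top} \Sigma_k^{1/2} V_k^{\top}\bigr)}_{B},
\qquad R R^{\top} = I .
```

Because R is orthogonal, the product A·B is unchanged, while the square root of Σ_k spreads the singular values across both factors rather than concentrating them in one, which is what keeps the entries of A and B within a smaller dynamic range.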
Publication Number: US9400955 (B2)    Publication Date: 2016.07.26
Application Number: US201314106633    Filing Date: 2013.12.13
Applicant: Amazon Technologies, Inc.    Inventor: Garimella Sri Venkata Surya Siva Rama Krishna
Classification: G06F15/18; G06N3/08; G06N99/00; G10L15/16    Main Classification: G06F15/18
Agency: Knobbe, Martens, Olson & Bear, LLP    Agent: Knobbe, Martens, Olson & Bear, LLP
Main Claim: 1. A computer-implemented method comprising: under control of one or more computing devices configured with specific computer-executable instructions,
obtaining data defining an artificial neural network comprising a matrix, the matrix representing weights between a first layer and a second layer of the artificial neural network;
decomposing the matrix using singular value decomposition into a first component matrix, a second component matrix, and a third component matrix;
approximating the matrix as a product of first and second low-rank matrices using a portion of the first component matrix, a portion of the second component matrix, a portion of the third component matrix, and a rotation matrix,
wherein a rank of the first low-rank matrix is less than a rank of the matrix and a rank of the second low-rank matrix is less than the rank of the matrix,
wherein the first low-rank matrix corresponds to a product of the portion of the first component matrix, a square root of the portion of the second component matrix, and the rotation matrix, and
wherein the second low-rank matrix corresponds to a product of a transpose of the rotation matrix, the square root of a portion of the second component matrix, and a portion of the third component matrix;
inserting a third layer between the first and second layers of the artificial neural network, wherein the first low-rank matrix represents weights between the first and third layers and the second low-rank matrix represents weights between the third and second layers; and
subsequently, retraining the artificial neural network for use as a speech recognition model.
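A minimal NumPy sketch of the steps recited in the claim, assuming a dense weight matrix W between two layers and a target rank k (the function name low_rank_split, the QR-based construction of the rotation matrix, and the example dimensions are illustrative assumptions, not details taken from the patent):

```python
import numpy as np

def low_rank_split(W, k, rng=None):
    """Approximate W as A @ B, splitting the singular values evenly between
    the two factors and applying a random rotation (illustrative sketch)."""
    rng = np.random.default_rng() if rng is None else rng

    # Truncated singular value decomposition: W ~= U_k diag(s_k) Vt_k.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

    # Random rotation matrix R (orthogonal: R @ R.T == I), here taken as the
    # Q factor of a QR decomposition of a Gaussian matrix.
    R, _ = np.linalg.qr(rng.standard_normal((k, k)))

    sqrt_s = np.sqrt(s_k)                    # square root of the singular values
    A = (U_k * sqrt_s) @ R                   # weights: first layer -> inserted layer
    B = R.T @ (sqrt_s[:, None] * Vt_k)       # weights: inserted layer -> second layer
    return A, B

# Example: approximate a 512 x 1024 weight matrix at rank 128.
W = np.random.default_rng(0).standard_normal((512, 1024))
A, B = low_rank_split(W, k=128)
print(A.shape, B.shape)                      # (512, 128) (128, 1024)
print(np.max(np.abs(A @ B - W)))             # residual of the rank-128 approximation
```

In terms of the claim, A and B would serve as the weights between the first and inserted (third) layers and between the inserted and second layers, respectively, after which the network would be retrained.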
Address: Seattle, WA, US