Title of Invention |
Efficient transfer of matrices for matrix based operations |
Abstract |
Techniques for transferring a matrix for performing one or more operations are provided. The techniques include applying a permutation on at least one of one or more columns and one or more rows of a matrix to group each of at least one of one or more columns and one or more rows of the matrix with a same alignment, blocking at least one of the grouped columns and grouped rows, and performing one or more operations on each matrix block. |
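The grouping step in the abstract can be illustrated with a minimal sketch. It assumes a column-major matrix whose first element sits at byte address `base_addr`, and it orders the columns so that those whose starting addresses share the same offset within an alignment boundary become adjacent. The function name `column_alignment_permutation` and the parameters `elem_size`, `align`, and `base_addr` are hypothetical names chosen for illustration; they do not come from the patent.

```python
def column_alignment_permutation(n_rows, n_cols, elem_size=8, align=128, base_addr=0):
    """Order columns so that columns with the same start-address alignment are adjacent.

    Assumes column-major storage: column j starts at base_addr + j * n_rows * elem_size.
    Returns (perm, offsets), where offsets[j] is the start address of column j modulo `align`.
    """
    col_bytes = n_rows * elem_size
    offsets = [(base_addr + j * col_bytes) % align for j in range(n_cols)]
    # sorted() is stable, so columns keep their original relative order within a group.
    perm = sorted(range(n_cols), key=lambda j: offsets[j])
    return perm, offsets


if __name__ == "__main__":
    perm, offsets = column_alignment_permutation(n_rows=6, n_cols=8, elem_size=8, align=64)
    print("offsets:", offsets)
    print("permutation:", perm)
```

With 6 rows of 8-byte elements, each column occupies 48 bytes, so only every fourth column shares an offset relative to a 64-byte boundary; the permutation brings those columns together.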
Publication Number |
US9058301(B2) |
Publication Date |
2015.06.16 |
Application Number |
US200912485365 |
Filing Date |
2009.06.16 |
Applicant |
International Business Machines Corporation |
Inventors |
Agrawal Prashant; Sabharwal Yogish; Saxena Vaibhav |
Classification Codes |
G06F12/00; G06F13/00; G06F13/28; G06F17/16; G06F7/76 |
Main Classification Code |
G06F12/00 |
Patent Agency |
Ryan, Mason & Lewis, LLP |
Agent |
Ryan, Mason & Lewis, LLP |
Principal Claim |
1. A method for transferring a matrix for performing one or more operations, wherein the method comprises:
applying a permutation on three or more columns and/or three or more rows of a matrix stored in column and/or row major ordering to associate the three or more columns and/or the three or more rows of the matrix with a same alignment of a starting memory address, wherein said applying is carried out by a permutation module executing on a hardware processor;
blocking two or more non-consecutive columns and/or two or more non-consecutive rows from the associated three or more columns and/or the associated three or more rows into at least one matrix block of non-contiguous data, wherein the starting memory address of each matrix block is aligned and the size of each matrix block is a multiple of a cache line size, and wherein said blocking is carried out by a permutation module executing on a hardware processor; and
performing one or more operations on each matrix block, wherein said performing is carried out by a matrix operation module. |
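The blocking step of the claim can be sketched as follows. This is one possible reading under stated assumptions, not the patented implementation: the matrix is column-major and starts at byte address 0, the cache line size is a multiple of the element size, and each block takes, from columns that share an alignment, the row range that begins on a cache-line boundary and spans a whole number of cache lines. The function name `plan_blocks`, its parameters, and the dictionary layout of a block are hypothetical.

```python
from collections import defaultdict


def plan_blocks(n_rows, n_cols, elem_size=8, cache_line=128, cols_per_block=2):
    """Plan blocks of same-alignment columns for a column-major matrix at address 0.

    Each block lists its columns, the first row used, the row count, and its size
    in bytes. Every column segment starts on a cache-line boundary and the block
    size is a multiple of `cache_line` (assumed to be a multiple of `elem_size`).
    """
    col_bytes = n_rows * elem_size
    rows_per_line = cache_line // elem_size

    # Group columns by the offset of their start address within a cache line.
    groups = defaultdict(list)
    for j in range(n_cols):
        groups[(j * col_bytes) % cache_line].append(j)

    blocks = []
    for offset, cols in sorted(groups.items()):
        # Skip leading rows until the column segment reaches a cache-line boundary.
        first_row = ((-offset) % cache_line) // elem_size
        # Keep only a whole number of cache lines per column segment.
        n_block_rows = (n_rows - first_row) // rows_per_line * rows_per_line
        if n_block_rows <= 0:
            continue
        for k in range(0, len(cols), cols_per_block):
            chunk = cols[k:k + cols_per_block]
            blocks.append({
                "cols": chunk,
                "first_row": first_row,
                "rows": n_block_rows,
                "bytes": len(chunk) * n_block_rows * elem_size,
            })
    return blocks


if __name__ == "__main__":
    for block in plan_blocks(n_rows=20, n_cols=8, elem_size=8, cache_line=64):
        print(block)
```

In this sketch the rows that fall before `first_row` or after the last whole cache line are left out of the blocks and would need separate handling; how the patent treats such remainders is not stated in the claim.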
Address |
Armonk NY US |