摘要 |
Mechanisms are provided for performing a matrix operation. A processor of a data processing system is configured to perform cluster-based matrix reordering of an input matrix. An input matrix, which comprises nodes associated with elements of the matrix, is received. The nodes are clustered into clusters based on numbers of connections with other nodes within and between the clusters, and the clusters are ordered by minimizing a total length of cross cluster connections between nodes of the clusters, to thereby generate a reordered matrix. A lookup table is generated identifying new locations of nodes of the input matrix, in the reordered matrix. A matrix operation is then performed based on the reordered matrix and the lookup table. |
主权项 |
1. A method, in a data processing system comprising a processor and a memory, for performing a matrix operation, the method comprising:
configuring the processor of the data processing system to perform cluster-based matrix reordering of an input matrix; receiving, by the processor, the input matrix, wherein the input matrix comprises nodes associated with elements of the matrix; clustering, by the processor, the nodes into clusters based on numbers of connections with other nodes within and between the clusters; ordering, by the processor, the clusters by minimizing a total length of cross cluster connections between nodes of the clusters, to thereby generate a reordered matrix; generating, by the processor, a lookup table identifying new locations of nodes of the input matrix, in the reordered matrix; storing, in a memory of the data processing system, data corresponding to the nodes in accordance with the new locations of nodes in the reordered matrix; and performing, by the processor, a matrix operation based on the reordered matrix and the lookup table at least by loading data corresponding to nodes in the reordered matrix into a cache memory of the data processing system, wherein the storage of the data corresponding to the nodes in accordance with the new locations of nodes in the reordered matrix minimizes cache misses in the cache memory when performing the matrix operation. |