摘要 |
In one embodiment, the present invention includes a method for constructing a data dependency graph (DDG) for a loop to be transformed, performing statement shifting to transform the loop into a first transformed loop according to at least one of first and second algorithms, performing unimodular and echelon transformations of a selected one of the first or second transformed loops, partitioning the selected transformed loop to obtain maximum outer level parallelism (MOLP), and partitioning the selected transformed loop into multiple sub-loops. Other embodiments are described and claimed. |