摘要 |
An apparatus and method for implementing adjacent, single non-unit stride memory access patterns are described. In one embodiment, the method includes compiler analysis of a source program to detect vectorizable loops having serial code statements that collectively perform adjacent, non-unit stride memory access. Once a vectorizable loop containing code statements that collectively perform adjacent, non-unit stride memory access in detected, the compiler vectorizes the serial code statements of the detected loop to perform the adjacent, non-unit stride memory access utilizing SIMD instructions. As such, the compiler repeats the analysis and vectorization for each vectorizable loop within the source program code.
|