Abstract |
An array processor is described with N processing elements, N memory modules, and an interconnection network that allows parallel access and alignment of rows, columns, diagonals, contiguous blocks, and distributed blocks of N × N arrays. The memory system of the array processor uses the minimum number of memory modules needed to achieve conflict-free memory access, and computes N addresses with O(log₂ N) logic gates in O(1) time. Furthermore, the interconnection network is of multistage design with O(N log₂ N) logic gates, and is able to align any of these vectors of data for store/fetch, as well as for subsequent processing, in a single pass through the network. |
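
The abstract does not give the address mapping itself, so the following is only a minimal sketch of what "conflict-free" means for N memory modules: a vector of N array elements can be fetched in parallel only if every element lies in a different module. The helper `is_conflict_free` and the mapping `linear_skew` are hypothetical names introduced for illustration, and the skew `(i + j) mod N` is a simple textbook mapping, not the scheme described above. It is shown because it handles rows and columns but fails on the main diagonal, which suggests why a more elaborate mapping is needed to cover all the vector types listed.

```python
from typing import Callable, Iterable, List, Tuple

Coord = Tuple[int, int]

N = 4  # assumed array dimension and module count, chosen only for illustration


def is_conflict_free(module_of: Callable[[int, int], int],
                     vector: Iterable[Coord]) -> bool:
    """Return True if every element of the vector maps to a distinct module."""
    modules = [module_of(i, j) for i, j in vector]
    return len(set(modules)) == len(modules)


def linear_skew(i: int, j: int) -> int:
    """Textbook skewed storage (an assumption, not the abstract's scheme)."""
    return (i + j) % N


def row(r: int) -> List[Coord]:
    return [(r, c) for c in range(N)]


def column(c: int) -> List[Coord]:
    return [(r, c) for r in range(N)]


def main_diagonal() -> List[Coord]:
    return [(k, k) for k in range(N)]


# Check every row and column, then the main diagonal.
rows_ok = all(is_conflict_free(linear_skew, row(r)) for r in range(N))
cols_ok = all(is_conflict_free(linear_skew, column(c)) for c in range(N))
diag_ok = is_conflict_free(linear_skew, main_diagonal())

print("rows conflict-free:    ", rows_ok)
print("columns conflict-free: ", cols_ok)
print("diagonal conflict-free:", diag_ok)
```

With N = 4 the diagonal elements (k, k) map to modules 2k mod 4, so two pairs of elements collide. The same brute-force check could be pointed at any candidate mapping, including block-shaped vectors, by passing a different `module_of` function and coordinate list.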