摘要 |
Clustered VLIW processing elements, each preferably simple and identical, are coupled by a runtime reconfigurable inter-cluster interconnect to form a coprocessor executing only those portions of a program having high instruction level parallelism. The initial portion of each program segment executed by the coprocessor reconfigures the interconnect, if necessary, or is skipped. Clusters may be directly connected to a subset of neighboring clusters, or indirectly connected to any other cluster, a hierarchy exposed to the programming model and enabling a larger number of clusters to be employed. The coprocessor is idled during remaining portions of the program to reduce power dissipation.
|