Abstract |
A computer program having threads and data is assigned to a processor having processor cores and memory organized over hardware locality groups. The computer program is profiled to generate a data thread interaction graph (DTIG) representing the computer program. The threads and the data of the computer program are organized over clusters using the DTIG and based on one or more constraints. The DTIG is displayed to a user, and the user is permitted to modify the constraints such that the threads and the data of the computer program are reorganized over the clusters. Each cluster is mapped onto one of the hardware locality groups. The computer program is regenerated based on the mappings of clusters to hardware locality groups. At run time, while the computer program is executed, optimizations are performed to improve execution performance.
|
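
The profiling, clustering, and mapping steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify a graph representation or a clustering algorithm, so everything here is an assumption made for illustration: the DTIG is modeled as weighted thread-to-data edges, the only constraint is a hypothetical cap on threads per cluster, clustering uses a simple greedy heuristic, and all function and parameter names (`build_dtig`, `cluster_dtig`, `map_clusters`, `remote_traffic`, `max_threads_per_cluster`) are invented for this example.

```python
from collections import defaultdict


def build_dtig(access_log):
    """Aggregate profile samples (thread, data, bytes) into weighted DTIG edges."""
    dtig = defaultdict(int)
    for thread, data, nbytes in access_log:
        dtig[(thread, data)] += nbytes
    return dict(dtig)


def cluster_dtig(dtig, num_clusters, max_threads_per_cluster):
    """Greedy heuristic (an assumption, not the patented method).

    Edges are visited heaviest first. A thread joins the cluster that already
    holds the data it touches if the constraint allows it, otherwise the
    least-loaded cluster. Data is placed with the first thread that claims it.
    """
    thread_cluster, data_cluster = {}, {}
    load = [0] * num_clusters  # number of threads placed in each cluster
    for (thread, data), _weight in sorted(dtig.items(), key=lambda kv: -kv[1]):
        if thread not in thread_cluster:
            preferred = data_cluster.get(data)
            if preferred is not None and load[preferred] < max_threads_per_cluster:
                chosen = preferred
            else:
                chosen = min(range(num_clusters), key=lambda i: load[i])
            if load[chosen] >= max_threads_per_cluster:
                raise ValueError("constraint cannot be satisfied: all clusters are full")
            thread_cluster[thread] = chosen
            load[chosen] += 1
        if data not in data_cluster:
            data_cluster[data] = thread_cluster[thread]
    return thread_cluster, data_cluster


def map_clusters(num_clusters, locality_groups):
    """Map each cluster onto one hardware locality group (one-to-one here)."""
    if len(locality_groups) < num_clusters:
        raise ValueError("need at least one locality group per cluster")
    return {c: locality_groups[c] for c in range(num_clusters)}


def remote_traffic(dtig, thread_cluster, data_cluster):
    """Total bytes accessed by threads placed in a different cluster than the data."""
    return sum(w for (t, d), w in dtig.items()
               if thread_cluster[t] != data_cluster[d])


if __name__ == "__main__":
    # Hypothetical profile: (thread, data object, bytes accessed).
    log = [("t0", "A", 100), ("t1", "A", 80), ("t0", "B", 10),
           ("t2", "B", 90), ("t3", "B", 70)]
    dtig = build_dtig(log)
    threads, data = cluster_dtig(dtig, num_clusters=2, max_threads_per_cluster=2)
    print(threads, data, map_clusters(2, ["lgroup0", "lgroup1"]))
    print("remote bytes:", remote_traffic(dtig, threads, data))
```

In this sketch, changing `max_threads_per_cluster` and re-running `cluster_dtig` plays the role of the user modifying a constraint and having the threads and data reorganized over the clusters; `remote_traffic` is one possible way to compare the resulting placements.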