Abstract |
A plurality of processor elements (PEs) are connected in a cluster by a common instruction bus to a sequencing control unit with its associated instruction memory. Each PE has data buses connected to at least its four nearest PE neighbors, referred to as its North, South, East and West PE neighbors. Each PE also has a general purpose register file containing several operand registers. A common instruction is fetched from the instruction memory by the sequencing control unit and broadcast over the instruction bus to each PE in the cluster. The instruction includes an opcode value that controls the arithmetic or logical operation performed by an execution unit in the PE on one or more operands in the register file. A switch is included in each PE to interconnect it with a first PE neighbor as the destination to which the result from the execution unit is sent. The broadcast instruction includes a destination field that controls the switch in the PE to dynamically select the destination neighbor PE to which the result is sent. Further, the broadcast instruction includes a target field that controls the switch in the PE to dynamically select the operand register, in the register file of the PE, in which another result received from another neighbor PE in the cluster is stored. In this manner, the instruction broadcast to all the PEs in the cluster dynamically controls the communication of operands and results between the PEs in the cluster, in a single-instruction, multiple-data (SIMD) processor array.
|
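The broadcast mechanism described in the abstract can be illustrated with a small behavioral sketch. This is not the claimed hardware; it is a minimal model under stated assumptions. The names `Instruction`, `broadcast`, `OPS` and `DIRS`, the two-source-operand instruction layout, and the wraparound (torus) connection at the array edges are all hypothetical choices made for illustration and are not taken from the source.

```python
from dataclasses import dataclass
import operator

# Hypothetical opcode table: maps an opcode value to an arithmetic/logical operation.
OPS = {"add": operator.add, "sub": operator.sub,
       "and": operator.and_, "or": operator.or_}

# (row delta, column delta) for each nearest-neighbor direction.
DIRS = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}


@dataclass(frozen=True)
class Instruction:
    opcode: str        # selects the operation in each PE's execution unit
    src_a: int         # index of the first operand register
    src_b: int         # index of the second operand register
    destination: str   # neighbor ("N", "S", "E", "W") that receives the result
    target: int        # register in which a received result is stored


def broadcast(regs, instr):
    """Apply one broadcast instruction to every PE in the cluster.

    regs[r][c] is the register file (a list of ints) of the PE at row r,
    column c. Edge wraparound is an assumption of this sketch.
    """
    rows, cols = len(regs), len(regs[0])
    op = OPS[instr.opcode]
    dr, dc = DIRS[instr.destination]

    # Phase 1: every PE executes the same operation on its own operands.
    results = [[op(regs[r][c][instr.src_a], regs[r][c][instr.src_b])
                for c in range(cols)] for r in range(rows)]

    # Phase 2: every PE sends its result to the selected neighbor, and the
    # receiving PE stores it in the register named by the target field.
    for r in range(rows):
        for c in range(cols):
            nr, nc = (r + dr) % rows, (c + dc) % cols
            regs[nr][nc][instr.target] = results[r][c]
    return regs
```

For example, on a 2x2 array, `broadcast(regs, Instruction("add", 0, 1, "E", 2))` makes every PE add its registers 0 and 1 and pass the sum to its East neighbor, which stores it in register 2. Computing all results before any register is written keeps the model faithful to lockstep SIMD execution, since no PE observes a neighbor's update from the same instruction.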