摘要 |
An architecture for a central processing unit (cpu) provides for the extraction of low-level concurrency from sequential instruction streams. The cpu includes an instruction queue, a plurality of processing elements, a sink storage matrix for temporary storage of data elements, and relational matrixes storing dependencies between instructions in the queue. An execution matrix stores the dynamic execution state of the instructions in the queue. An executable independence calculator determines which instructions are eligible for execution and the location of source data elements. New techniques are disclosed for determining data independence of instructions, for branch prediction without state restoration or backtracking, and for the decoupling of instruction execution from memory updating. |