Title of invention: Methods and apparatus for independent processor node operations in a SIMD array processor
Abstract: A control processor is used for fetching and distributing single instruction multiple data (SIMD) instructions to a plurality of processing elements (PEs). One of the SIMD instructions is a thread start (Tstart) instruction, which causes the control processor to pause its instruction fetching. A local PE instruction memory (PE Imem) is associated with each PE and contains local PE instructions for execution on the local PE. Local PE Imem fetch, decode, and execute logic are associated with each PE. Instruction path selection logic in each PE is used to select between control processor distributed instructions and local PE instructions fetched from the local PE Imem. Each PE is also initialized to receive control processor distributed instructions. In addition, local hold generation logic is associated with each PE. A PE receiving a Tstart instruction causes the instruction path selection logic to switch to fetch local PE Imem instructions.
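The path switching described in the abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the patented hardware: the class `PE`, the function `run`, the string opcodes, and the rule that the control processor resumes fetching once every PE has executed a Tstop are all hypothetical choices made for illustration (the abstract states only that the control processor pauses on Tstart).

```python
from dataclasses import dataclass, field


@dataclass
class PE:
    """Toy model of one processing element (all names are hypothetical)."""
    imem: list                      # local PE instruction memory (PE Imem)
    local_mode: bool = False        # False: PE is initialized to the distributed path
    pc: int = 0                     # local program counter into imem
    trace: list = field(default_factory=list)

    def receive(self, instr):
        """Handle one instruction distributed by the control processor."""
        if instr == "TSTART":
            # Instruction path selection switches to the local PE Imem.
            self.local_mode = True
            self.pc = 0
        else:
            self.trace.append(("simd", instr))

    def step_local(self):
        """Fetch and execute one instruction from the local PE Imem."""
        instr = self.imem[self.pc]
        self.pc += 1
        if instr == "TSTOP":
            # Assumption: Tstop returns the PE to the distributed path.
            self.local_mode = False
        else:
            self.trace.append(("local", instr))


def run(program, pes):
    """Control processor loop: distribute SIMD instructions, pause on Tstart."""
    for instr in program:
        for pe in pes:
            pe.receive(instr)
        if instr == "TSTART":
            # Assumption: fetching stays paused until every PE thread has stopped.
            while any(pe.local_mode for pe in pes):
                for pe in pes:
                    if pe.local_mode:
                        pe.step_local()


# Two PEs run different local threads between the same SIMD instructions.
pes = [PE(["a0", "a1", "TSTOP"]), PE(["b0", "TSTOP"])]
run(["add", "TSTART", "sub"], pes)
```

After the run, each PE's `trace` shows the shared SIMD instruction `add`, then that PE's own local instructions, then the shared `sub`, which is the independent-thread behavior the abstract describes.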
Publication number: US9063722(B2)  Publication date: 2015.06.23
Application number: US201113332482  Filing date: 2011.12.21
Applicant: Altera Corporation  Inventors: Pechanek Gerald George; Barry Edwin Franklin; Stojancic Mihailo
Classification: G06F9/38; G06F9/30; G06F9/32; G06F15/80  Main classification: G06F9/38
Agency: Law Offices of Peter H. Priest, PLLC  Agent: Law Offices of Peter H. Priest, PLLC
Main claim: 1. A method for executing very long instruction words (VLIWs) separately on individual processing elements (PEs), the method comprising: receiving a thread start (Tstart) instruction from a first instruction path in each PE of a plurality of PEs; switching in each PE from the first instruction path to a second instruction path in response to the Tstart instruction, wherein the first instruction path is used to receive single instruction multiple data (SIMD) instructions distributed to each PE and the second instruction path is used to receive instructions from a local PE instruction memory (PE Imem); fetching a PE execute VLIW (PEXV) instruction from the local PE Imem in each PE; selecting a VLIW having a plurality of slot instructions from a VLIW memory located in each PE in response to the PEXV instruction, to decode and execute the plurality of slot instructions in parallel in each PE; executing a Tstop instruction fetched from the local PE Imem that is specified to execute a PE instruction in the shadow of the Tstop instruction; and executing the PE instruction in the shadow of the Tstop instruction.
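The PEXV and Tstop steps of the claim can be sketched for a single PE as follows. The instruction encoding, the function `run_thread`, the `SET` opcode, the boolean shadow flag on `TSTOP`, and the read-all-then-write-all interpretation of parallel slot execution are assumptions made for illustration; the claim does not specify any of them.

```python
def run_thread(imem, vim, regs):
    """Run one PE's local thread (hypothetical encoding, illustration only).

    imem: list of (opcode, argument) pairs held in the local PE Imem
    vim:  VLIW memory, mapping an address to a list of (dest, function) slots
    regs: register file as a dict, updated in place and returned
    """
    pc, in_shadow = 0, False
    while pc < len(imem):
        op, arg = imem[pc]
        pc += 1
        if op == "TSTOP":
            if not arg:          # no shadow instruction requested: stop now
                break
            in_shadow = True     # execute exactly one more instruction
            continue
        if op == "PEXV":
            # Select the VLIW at address `arg` and run its slots "in parallel":
            # every slot reads the old register values before any slot writes.
            results = [(dest, fn(regs)) for dest, fn in vim[arg]]
            for dest, value in results:
                regs[dest] = value
        elif op == "SET":        # hypothetical simple PE instruction
            dest, value = arg
            regs[dest] = value
        if in_shadow:
            break                # the shadow instruction has completed
    return regs


# VLIW 0 swaps r0 and r1, which only works if both slots read before writing.
vim = {0: [("r0", lambda r: r["r1"]), ("r1", lambda r: r["r0"])]}
imem = [
    ("PEXV", 0),
    ("TSTOP", True),        # Tstop specified with a shadow instruction
    ("SET", ("r2", 7)),     # executes in the shadow of Tstop
    ("SET", ("r2", 99)),    # never reached
]
result = run_thread(imem, vim, {"r0": 1, "r1": 2, "r2": 0})
```

In this sketch the instruction following the Tstop still executes before the thread ends, while the one after it does not, which mirrors the claim's "PE instruction in the shadow of the Tstop instruction".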
Address: San Jose CA US