Multithreaded processor with multiple concurrent pipelines per thread, Application No. US201113282800 - 传众专利搜索

Title	Multithreaded processor with multiple concurrent pipelines per thread
Abstract	A multithreaded processor comprises a plurality of hardware thread units, an instruction decoder coupled to the thread units for decoding instructions received therefrom, and a plurality of execution units for executing the decoded instructions. The multithreaded processor is configured for controlling an instruction issuance sequence for threads associated with respective ones of the hardware thread units. On a given processor clock cycle, only a designated one of the threads is permitted to issue one or more instructions, but the designated thread that is permitted to issue instructions varies over a plurality of clock cycles in accordance with the instruction issuance sequence. The instructions are pipelined in a manner which permits at least a given one of the threads to support multiple concurrent instruction pipelines.
Publication No.	US8762688(B2)	Publication Date	2014.06.24
Application No.	US201113282800	Filing Date	2011.10.27
Applicant	QUALCOMM Incorporated	Inventors	Hokenek Erdem;Moudgill Mayan;Schulte Michael J.;Glossner C. John
Classification	G06F9/00	Main Classification	G06F9/00
Agency	Knobbe Martens Olson & Bear LLP	Agent	Knobbe Martens Olson & Bear LLP
Main Claim	1. A multithreaded processor comprising: means for permitting a thread to issue one or more instructions on a processor clock cycle; means for varying the thread permitted to issue instructions over a plurality of clock cycles in accordance with an instruction issuance sequence; and means for pipelining the instructions to permit the threads to support multiple concurrent instruction pipelines, wherein the pipelined instructions comprise at least a vector multiplication and reduction instruction that includes an instruction decode stage, a vector register file read stage, at least two multiply stages, at least two add stages, an accumulator read stage, a plurality of reduction stages, and an accumulator writeback stage; wherein the vector multiplication and reduction instruction is pipelined using a number of stages which is greater than a total number of threads of the processor; and wherein vector multiplication and reduction instruction pipelines are shifted relative to one another to permit computation cycles which are longer than issue cycles without forwarding logic to allow lengthening of execution phases without causing bubbles in the pipelines.
Address	San Diego CA US
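
The claim's condition that the pipeline has more stages than the processor has threads can be illustrated with a small simulation. The sketch below is not taken from the patent: it assumes a plain round-robin issuance sequence, one instruction issued per turn, one clock cycle per stage, and the minimum stage counts the claim allows (two multiply, two add, two reduction stages, for ten stages in total). The thread count of 8 and the names simulate and max_concurrent_per_thread are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Stage names follow the claim. Using exactly two multiply, add, and
# reduction stages is an assumption (the claim only gives minimums).
STAGES = ["decode", "vrf_read", "mpy1", "mpy2", "add1", "add2",
          "acc_read", "reduce1", "reduce2", "acc_writeback"]


def simulate(num_threads, num_cycles, stages=STAGES):
    """Simulate round-robin issue (an assumed issuance sequence).

    On cycle c only thread (c % num_threads) issues one instruction.
    Each stage is assumed to take exactly one clock cycle.
    Returns {cycle: {thread: [stage of each in-flight instruction]}}.
    """
    depth = len(stages)
    in_flight = []  # list of (thread, issue_cycle)
    timeline = {}
    for cycle in range(num_cycles):
        # Only the designated thread issues on this cycle.
        in_flight.append((cycle % num_threads, cycle))
        # Retire instructions that have passed the last stage.
        in_flight = [(t, c) for t, c in in_flight if cycle - c < depth]
        per_thread = defaultdict(list)
        for t, c in in_flight:
            per_thread[t].append(stages[cycle - c])
        timeline[cycle] = dict(per_thread)
    return timeline


def max_concurrent_per_thread(timeline):
    """Largest number of in-flight instructions any one thread ever has."""
    return max(len(s) for per_thread in timeline.values()
               for s in per_thread.values())


if __name__ == "__main__":
    timeline = simulate(num_threads=8, num_cycles=32)
    # Thread 0 issued at cycle 0 and again at cycle 8, so at cycle 8 its
    # first instruction is still in a reduction stage.
    print(timeline[8][0])
    print(max_concurrent_per_thread(timeline))
```

Under these assumptions, a thread gets a new issue slot every 8 cycles while each instruction occupies the pipeline for 10, so consecutive instructions from the same thread overlap and the thread holds two pipelines at once. With 16 threads in the same model the overlap disappears, which is why the claim ties the effect to the stage count exceeding the thread count. The sketch does not model the relative shifting of pipelines or the absence of forwarding logic described in the claim.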