Title Processor pipeline which implements fused and unfused multiply-add instructions
Abstract Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused/unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.
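The numerical difference between the two operations named in the abstract can be sketched in a few lines. This is only an illustration of the usual meaning of the terms (unfused rounds the product and then rounds the sum; fused rounds once, after the exact product and sum), not the patented hardware. The function names and the use of Python's exact `Fraction` arithmetic to model the single rounding are assumptions made for this sketch.

```python
from fractions import Fraction


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Multiply, round to double, then add and round again (two roundings)."""
    product = a * b          # first rounding happens here
    return product + c       # second rounding happens here


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Compute a*b + c exactly, then round to double once (software model)."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # no rounding yet
    return float(exact)      # the single rounding step


# Hypothetical operands chosen so the exact product needs more than 53 bits.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

unfused_result = unfused_multiply_add(a, b, c)
fused_result = fused_multiply_add(a, b, c)
print(unfused_result, fused_result)
```

Here the exact product is 1 - 2^-60, which rounds to 1.0 in double precision, so the unfused path loses the small term while the fused path keeps it.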
Publication No. US8977670(B2) Publication Date 2015.03.10
Application No. US201213469212 Filing Date 2012.05.11
Applicant Oracle International Corporation Inventors Brooks Jeffrey S.; Olson Christopher H.
Classification G06F7/38; G06F7/483; G06F7/544 Main Classification G06F7/38
Agency Meyertons Hood Kivlin Kowert & Goetzel Agent Meyertons Hood Kivlin Kowert & Goetzel
Main Claim 1. A system for implementing an unfused multiply-add instruction within a fused multiply-add pipeline, comprising: a multiplier tree having two inputs for receiving a first value and a second value for multiplication, wherein the multiplier tree is configured to produce a first partial product and a second partial product; a first carry save adder (CSA), wherein the first CSA is configured to receive the first partial product, the second partial product, and an aligned addition term, wherein the first CSA is configured to produce first and second CSA terms; and a fused/unfused multiply add (FUMA) block configured to perform one of an unfused multiply add operation and a fused multiply add operation, wherein the FUMA block includes: a second CSA configured to receive the first partial product, the second partial product, and the aligned addition term and to produce a multiply add intermediate result including third and fourth CSA terms, wherein the first partial product and the second partial product are not truncated; a half adder coupled to the second CSA and configured to produce first and second half-adder terms from the third and fourth CSA terms; a first carry propagate adder (CPA) coupled to the half adder and configured to provide a first CPA sum from the first and second half-adder terms; and a second CPA coupled to the second CSA and configured to provide a second CPA sum from the third and fourth CSA terms.
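The adder elements recited in the claim can be modeled at the bit level on non-negative Python integers. The sketch below shows only that a carry save adder and a half adder each re-express their inputs as two terms with the same total, so both carry propagate adders arrive at the same value in this simplified model. It does not model operand widths, alignment, rounding, or the reason the claimed block has two CPA paths. All function names and operand values are hypothetical.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 compression: three inputs become a sum term and a carry term."""
    sum_term = x ^ y ^ z                                # per-bit sum, no carry ripple
    carry_term = ((x & y) | (x & z) | (y & z)) << 1     # per-bit majority, shifted left
    return sum_term, carry_term


def half_add(x: int, y: int) -> tuple[int, int]:
    """2:2 compression: two inputs become a sum term and a carry term."""
    return x ^ y, (x & y) << 1


def carry_propagate_add(x: int, y: int) -> int:
    """Full addition with carry propagation (Python's integer add)."""
    return x + y


# Hypothetical 4-bit values standing in for the two partial products
# and the aligned addition term.
partial_product_1 = 0b1011
partial_product_2 = 0b0110
aligned_addend = 0b0101

csa_sum, csa_carry = carry_save_add(partial_product_1, partial_product_2, aligned_addend)
ha_sum, ha_carry = half_add(csa_sum, csa_carry)

first_cpa_sum = carry_propagate_add(ha_sum, ha_carry)     # path through the half adder
second_cpa_sum = carry_propagate_add(csa_sum, csa_carry)  # path directly from the CSA
print(first_cpa_sum, second_cpa_sum)
```

Because neither compression step propagates carries across bit positions, each costs a fixed gate delay regardless of operand width; only the final carry propagate adders pay for a full carry chain.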
Address Redwood Shores CA US