Title of Invention |
Apparatus and method for performing fused multiply add floating point operation |
Abstract |
A fused multiply add floating point unit 1 includes multiplying circuitry 4 and adding circuitry 8. The multiplying circuitry 4 multiplies operands B and C having N-bit significands to generate an unrounded product B*C. The unrounded product B*C has an M-bit significand, where M>N. The adding circuitry 8 receives an operand A that is input at a later processing cycle than the processing cycle at which the multiplying circuitry 4 receives operands B and C. The adding circuitry 8 commences processing of the operand A after the unrounded product B*C is generated by the multiplying circuitry 4. The adding circuitry 8 adds the operand A to the unrounded product B*C and outputs a rounded result A+B*C. |
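The numerical effect of keeping the product B*C unrounded can be illustrated in software. The following is a minimal sketch, not the circuitry described in the abstract: it assumes IEEE 754 double precision (N = 53) and uses exact rational arithmetic to stand in for the M-bit intermediate product. The function name `fused_multiply_add` and the chosen operand values are illustrative assumptions.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Return a + b*c with a single rounding at the very end.

    Assumption: exact rationals model the unrounded M-bit product;
    real hardware keeps a wide significand instead.
    """
    exact = Fraction(a) + Fraction(b) * Fraction(c)  # no rounding yet
    return float(exact)  # one rounding, to the nearest double


# Operands chosen so that the exact product is 1 - 2**-104.
a = -1.0
b = 1.0 + 2.0**-52
c = 1.0 - 2.0**-52

# Width of the exact product's significand (numerator over a power of two).
product_bits = (Fraction(b) * Fraction(c)).numerator.bit_length()

# Unfused: b*c is rounded to 53 bits first, which loses the 2**-104 term.
unfused = a + b * c

# Fused: the addition sees the unrounded product, so the term survives.
fused = fused_multiply_add(a, b, c)
```

With these operands the exact product needs more than 53 significand bits, so rounding it before the addition discards exactly the part of the product that the final sum consists of.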
Publication Number |
US8990282(B2) |
Publication Date |
2015.03.24 |
Application Number |
US200912585668 |
Filing Date |
2009.09.21 |
Applicant |
ARM Limited |
Inventor |
Lutz David Raymond |
Classification |
G06F7/485;G06F7/487;G06F7/483;G06F7/544 |
Main Classification |
G06F7/485 |
Agency |
Nixon & Vanderhye P.C. |
Agent |
Nixon & Vanderhye P.C. |
Independent Claim |
1. A data processing apparatus for performing a fused multiply add operation on operands A, B and C to generate a result A+B*C, said operands A, B and C and said result A+B*C being floating point values each having an N-bit significand, said data processing apparatus comprising:
multiplying circuitry configured to multiply said operand B and said operand C to generate an unrounded product B*C having an M-bit significand, where M>N;
adding circuitry configured to add said unrounded product B*C to said operand A and output a rounded result A+B*C having an N-bit significand; and
control circuitry responsive to a fused multiply add instruction to control said multiplying circuitry and said adding circuitry to perform said fused multiply add operation in a plurality of processing cycles;
wherein said adding circuitry comprises a first input for receiving, from a register or as a result of a preceding instruction, said operand A at a later processing cycle than a processing cycle at which said operands B and C are input to said multiplying circuitry; and
said adding circuitry is controlled by said control circuitry to commence processing of said operand A after said multiplying circuitry has generated said unrounded product B*C,
wherein said data processing apparatus is configured to obtain said operand A from the register or the result of the preceding instruction in a later processing cycle than the processing cycle in which said operands B and C are input to said multiplying circuitry,
said control circuitry is responsive to a multiply instruction to control said multiplying circuitry to multiply said operands B and C; and
said data processing apparatus is configured to execute said multiply instruction in fewer processing cycles than said fused multiply add instruction. |
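The timing consequence of receiving operand A later than operands B and C can be sketched with a toy scheduling model for a chain of dependent accumulations acc = acc + B[i]*C[i]. This is only an illustration under stated assumptions: the latencies `MUL_CYCLES` and `ADD_CYCLES`, the one-issue-per-cycle rule, and the function `chain_latency` are hypothetical and do not come from the claim.

```python
MUL_CYCLES = 3  # hypothetical multiplier latency, not taken from the claim
ADD_CYCLES = 3  # hypothetical adder latency, not taken from the claim


def chain_latency(n: int, late_a: bool) -> int:
    """Cycle at which the last of n dependent fused multiply adds finishes.

    Each operation computes acc = acc + B[i]*C[i], so its operand A is the
    result of the preceding instruction. B[i] and C[i] are assumed to be
    available from cycle 0, and one instruction may issue per cycle.
    """
    a_ready = 0  # cycle at which the current accumulator value exists
    finish = 0
    for i in range(n):
        if late_a:
            # The multiplier starts without A; A is needed only when the
            # adder begins, after the unrounded product has been generated.
            mul_start = i
            add_start = max(mul_start + MUL_CYCLES, a_ready)
        else:
            # All three operands must be present before the operation starts.
            mul_start = max(i, a_ready)
            add_start = mul_start + MUL_CYCLES
        finish = add_start + ADD_CYCLES
        a_ready = finish
    return finish


late = chain_latency(4, late_a=True)
early = chain_latency(4, late_a=False)
```

In this model the late-A chain pays the multiplier latency once and the adder latency per operation, whereas the chain that needs A up front pays both latencies for every operation.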
Address |
Cambridge GB |