发明名称 |
APPARATUS AND METHOD FOR CONTROLLING ROUNDING WHEN PERFORMING A FLOATING POINT OPERATION |
摘要 |
An apparatus and method are provided for controlling rounding when performing a floating point operation. The apparatus has argument reduction circuitry to perform an argument reduction operation, and in addition provides reduce and round circuitry that generates from a supplied floating point value a modified floating point value to be input to the argument reduction circuitry. The reduce and round circuitry is arranged to modify a significand of the supplied floating point value, based on a specified value N, in order to produce a truncated significand with a specified rounding applied, the truncated significand being N bits shorter than the significand of the supplied floating point value, and then being used as a significand for the modified floating point value. The specified value N is chosen such that the argument reduction operation performed using the modified floating point value will inhibit roundoff error in a result of the argument reduction operation. By enabling roundoff error to be inhibited in such a way, it is possible to use such argument reduction circuitry in the computation of a number of floating point operations whilst enabling the correct rounded result to be obtained. |
申请公布号 |
US2017010863(A1) |
申请公布日期 |
2017.01.12 |
申请号 |
US201615156379 |
申请日期 |
2016.05.17 |
申请人 |
ARM LIMITED |
发明人 |
NYSTAD JØrn |
分类号 |
G06F7/499;G06F7/483 |
主分类号 |
G06F7/499 |
代理机构 |
|
代理人 |
|
主权项 |
1. An apparatus comprising:
argument reduction circuitry to perform an argument reduction operation; and reduce and round circuitry to generate from a supplied floating point value a modified floating point value to be input to the argument reduction circuitry; the reduce and round circuitry being arranged to modify a significand of the supplied floating point value, based on a specified value N, in order to produce a truncated significand with a specified rounding applied, the truncated significand being N bits shorter than the significand of the supplied floating point value, and being used as a significand for the modified floating point value; the specified value N being such that the argument reduction operation performed using the modified floating point value inhibits roundoff error in a result of the argument reduction operation. |
地址 |
Cambridge GB |