Title of invention APPARATUS AND METHOD FOR CONTROLLING ROUNDING WHEN PERFORMING A FLOATING POINT OPERATION
Abstract An apparatus and method are provided for controlling rounding when performing a floating point operation. The apparatus has argument reduction circuitry to perform an argument reduction operation, and in addition provides reduce and round circuitry that generates from a supplied floating point value a modified floating point value to be input to the argument reduction circuitry. The reduce and round circuitry is arranged to modify a significand of the supplied floating point value, based on a specified value N, in order to produce a truncated significand with a specified rounding applied, the truncated significand being N bits shorter than the significand of the supplied floating point value, and then being used as a significand for the modified floating point value. The specified value N is chosen such that the argument reduction operation performed using the modified floating point value will inhibit roundoff error in a result of the argument reduction operation. By enabling roundoff error to be inhibited in such a way, it is possible to use such argument reduction circuitry in the computation of a number of floating point operations whilst enabling the correct rounded result to be obtained.
Publication number US2017010863(A1) Publication date 2017.01.12
Application number US201615156379 Filing date 2016.05.17
Applicant ARM LIMITED Inventor NYSTAD JØRN
Classification G06F7/499;G06F7/483 Main classification G06F7/499
Agency Agent
Principal claim 1. An apparatus comprising: argument reduction circuitry to perform an argument reduction operation; and reduce and round circuitry to generate from a supplied floating point value a modified floating point value to be input to the argument reduction circuitry; the reduce and round circuitry being arranged to modify a significand of the supplied floating point value, based on a specified value N, in order to produce a truncated significand with a specified rounding applied, the truncated significand being N bits shorter than the significand of the supplied floating point value, and being used as a significand for the modified floating point value; the specified value N being such that the argument reduction operation performed using the modified floating point value inhibits roundoff error in a result of the argument reduction operation.
Address Cambridge GB
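
Illustrative note: the reduce and round step described in the abstract and claim 1 can be sketched in software. The sketch below is not the claimed circuitry. It assumes IEEE 754 double precision (a 53-bit significand) and round-to-nearest-even as the "specified rounding"; the record fixes neither. The function name `reduce_and_round` and the parameter `sig_bits` are hypothetical names introduced only for this illustration.

```python
import math


def reduce_and_round(x: float, n: int, sig_bits: int = 53) -> float:
    """Return x with its significand shortened by n bits.

    Assumptions (not taken from the patent record): binary64 input with a
    53-bit significand, and round-to-nearest, ties-to-even as the rounding.
    """
    if not 0 <= n < sig_bits:
        raise ValueError("n must satisfy 0 <= n < sig_bits")
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        return x  # nothing to shorten for zeros, infinities, and NaNs

    # Decompose x = m * 2**e with 0.5 <= |m| < 1.
    m, e = math.frexp(x)

    # Scale so that the bits to keep form the integer part.
    # Multiplying by a power of two is exact.
    keep = sig_bits - n
    scaled = m * float(1 << keep)

    # Python's round() on a float rounds half to even and returns an int.
    rounded = round(scaled)

    # Reassemble; the result has at most `keep` significant bits.
    return math.ldexp(float(rounded), e - keep)
```

One way to see why the shortening can help, offered as an illustration rather than as the argument reduction operation of the patent: once N low bits are freed, multiplying the modified value by an integer smaller than 2**N fits in the original significand width, so that product incurs no roundoff. For example, with N = 10, `reduce_and_round(math.pi, 10) * 1000` is exact in double precision.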