Title of invention: Vector floating point argument reduction
Abstract: A processing apparatus is provided with processing circuitry (6, 8) and decoder circuitry (10) responsive to a received argument reduction instruction (FREDUCE4, FDOT3R) to generate control signals (16) for controlling the processing circuitry (6, 8). The action of the argument reduction instruction is to subject each component of an input vector to a scaling which adds an exponent shift value C to, or subtracts it from, the exponent of that input vector component. The exponent shift value C is selected such that the sum of C and the maximum exponent value B of any of the input vector components lies within a range between a first predetermined value and a second predetermined value. A consequence of executing this argument reduction instruction is that the result vector, when subjected to a dot-product operation, will be resistant to floating point underflows or overflows.
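The scaling described in the abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented circuit: `reduce_arguments` and `target_exponent` are hypothetical names, a single target exponent stands in for the range between the two predetermined values, and all inputs are assumed to be finite.

```python
import math

def reduce_arguments(vec, target_exponent=0):
    """Scale every component by the same power of two, 2**C.

    B is the largest exponent among the non-zero components and
    C = target_exponent - B, so that B + C equals target_exponent
    (a hypothetical value chosen inside the permitted range).
    Returns the scaled vector and the shift C.
    """
    nonzero = [x for x in vec if x != 0.0]
    if not nonzero:
        return list(vec), 0  # nothing to scale
    # math.frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1.
    b = max(math.frexp(x)[1] for x in nonzero)
    c = target_exponent - b
    # math.ldexp(x, c) computes x * 2**c, i.e. it adds c to the exponent.
    return [math.ldexp(x, c) for x in vec], c

# Usage: a vector whose squared components would overflow double precision.
scaled, c = reduce_arguments([3e200, 4e200])
dot = sum(x * x for x in scaled)              # finite after reduction
length = math.ldexp(math.sqrt(dot), -c)       # undo the scaling afterwards
```

Without the reduction, `3e200 * 3e200` already overflows to infinity; with it, the dot product stays in range and the original magnitude is recovered by shifting the exponent back by `-c`.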
Publication number: US9146901(B2) Publication date: 2015.09.29
Application number: US201113137576 Filing date: 2011.08.26
Applicant: ARM Limited Inventor: Nystad Jorn
Classification: G06F17/10;G06F7/483;G06F7/552;G06F5/01;G06F9/30 Main classification: G06F17/10
Agency: Nixon & Vanderhye P.C. Agent: Nixon & Vanderhye P.C.
Independent claim: 1. Apparatus for processing data comprising: processing circuitry configured to perform processing operations upon data values; and decoder circuitry coupled to said processing circuitry and configured to decode program instructions to generate control signals for controlling said processing circuitry to perform processing operations specified by said program instructions; wherein said decoder circuitry is responsive to an argument reduction instruction to generate control signals to control said processing circuitry to perform a processing operation upon a vector floating point value having a plurality of components, each of said plurality of components including an integer exponent value and a mantissa value, said processing operation including generating a plurality of result components, the processing operation comprising: for each of said plurality of components, forming a high order exponent portion Eho being an uppermost P bits of said integer exponent value, where P is less than a total number of bits within said integer exponent value, and selecting a highest value Ehomax from among said high order exponent portions Eho, wherein Ehomax identifies a highest integer exponent value B of said plurality of components; selecting an exponent shift value C such that (B+C) is less than a first predetermined value Edotmax and (B+C) is greater than a second predetermined value Edotmin, where said exponent shift value C is an integer value; and for each of said plurality of components, if said exponent shift value C is non-zero, then adding a value of (2^(P−1) − Ehomax) to said high order exponent portion Eho to generate one of said plurality of result components.
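The high order exponent steps of the claim can be sketched on IEEE 754 binary32 bit patterns as follows. This is a minimal illustration, not the claimed hardware. Several things are assumptions that the claim does not state: the function name `reduce_high_order`, the choice P = 3, treating zero and subnormal inputs as signed zero, flushing a component to signed zero when its adjusted Eho would be negative, and rejecting infinities and NaNs.

```python
import struct

EXP_BITS = 8    # exponent field width of binary32
MANT_BITS = 23  # mantissa field width of binary32

def reduce_high_order(vec, p=3):
    """Add (2**(p-1) - Ehomax) to the top p exponent bits of each component.

    Returns the result vector and the equivalent exponent shift C.
    """
    low_bits = EXP_BITS - p
    # Reinterpret each value as its 32-bit binary32 pattern.
    words = [struct.unpack("<I", struct.pack("<f", x))[0] for x in vec]
    exps = [(w >> MANT_BITS) & 0xFF for w in words]
    if any(e == 0xFF for e in exps):
        raise ValueError("infinities and NaNs are not handled in this sketch")
    ehos = [e >> low_bits for e in exps]        # uppermost p exponent bits
    ehomax = max(ehos)
    delta = (1 << (p - 1)) - ehomax             # 2^(P-1) - Ehomax
    out = []
    for w, e, eho in zip(words, exps, ehos):
        new_eho = eho + delta
        if e == 0 or new_eho < 0:
            # Assumption: zero, subnormal, or too-small values become signed zero.
            new_w = w & 0x80000000
        else:
            low = e & ((1 << low_bits) - 1)     # low exponent bits are untouched
            new_exp = (new_eho << low_bits) | low
            new_w = (w & 0x807FFFFF) | (new_exp << MANT_BITS)
        out.append(struct.unpack("<f", struct.pack("<I", new_w))[0])
    return out, delta << low_bits

# Usage: both components share Eho = 7, so delta = 4 - 7 = -3 and C = -96.
result, c = reduce_high_order([3.0e30, 4.0e30])
```

Because only the uppermost P bits are compared and adjusted, the shift C is always a multiple of 2^(8−P), and the largest component ends up with Eho equal to 2^(P−1), which places its exponent near the middle of the representable range.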
Address: Cambridge GB