Title: Vector math instruction execution by DSP processor approximating division and complex number magnitude
Abstract: A digital signal processor (DSP) includes an instruction fetch unit, an instruction decode unit, a register set and a plurality of work units in communication with the instruction decode unit. A first embodiment calculates two divisions on packed numerators and packed denominators. The DSP work units calculate indexes into a 1/d look-up table and make a final sign correction. A second embodiment calculates an approximation of a vector magnitude of a complex number x+jy. The approximation is based upon √(x²+y²) ≈ α*max(|x|, |y|) + β*min(|x|, |y|). The DSP work units calculate the absolute values, find the maxima and minima, and form the packed results of two vector magnitude calculations.
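The magnitude approximation named in the abstract can be illustrated with a short floating-point sketch. This is not the patented packed-register implementation. The function names are hypothetical, and the values of α and β are an assumption: the abstract does not give them, so the sketch uses the commonly cited pair that minimizes the worst-case relative error (about 4%).

```python
import math

# Assumed coefficients (not stated in the abstract): the well-known pair
# that minimizes the maximum relative error of alpha*max + beta*min.
ALPHA = 2 * math.cos(math.pi / 8) / (1 + math.cos(math.pi / 8))  # ~0.9604
BETA = 2 * math.sin(math.pi / 8) / (1 + math.cos(math.pi / 8))   # ~0.3978


def approx_magnitude(x, y, alpha=ALPHA, beta=BETA):
    """Approximate sqrt(x*x + y*y) as alpha*max(|x|,|y|) + beta*min(|x|,|y|)."""
    ax, ay = abs(x), abs(y)          # absolute values of both components
    return alpha * max(ax, ay) + beta * min(ax, ay)


def packed_magnitudes(z_hi, z_lo):
    """Mimic the 'two results per register' idea with a pair of (x, y) inputs."""
    return approx_magnitude(*z_hi), approx_magnitude(*z_lo)


if __name__ == "__main__":
    hi, lo = packed_magnitudes((3, 4), (-12, 5))
    print(hi, math.hypot(3, 4))
    print(lo, math.hypot(-12, 5))
```

The approximation needs only absolute values, a compare, two multiplies and an add, which is why it maps well onto packed DSP work units that lack a square-root instruction.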
Publication number: US9015452(B2)  Publication date: 2015.04.21
Application number: US201012708180  Filing date: 2010.02.18
Applicant: Texas Instruments Incorporated  Inventor: Dasgupta Udayan
Classification: G06F9/302; G06F9/30; G06F9/38; G06F7/48; G06F7/535; G06F7/548; G06F7/552  Main classification: G06F9/302
Agency:  Agent: Marshall, Jr. Robert B.; Cimino Frank D.
Principal claim: 1. A method for performing an approximate division on a digital signal processor having a plurality of registers, each register storing data of N bits, and a plurality of work units, each work unit performing data processing operations under instruction control, the steps comprising:
storing a first numerator operand of N/2 bits in a set of N/2 most significant bits of a first register of the plurality of registers;
storing a second numerator operand of N/2 bits in a set of N/2 least significant bits of the first register;
storing a first denominator operand of N/2 bits in a set of N/2 most significant bits of a second register of the plurality of registers;
storing a second denominator operand of N/2 bits in a set of N/2 least significant bits of the second register;
employing one of the work units to separately form a first absolute value of the most significant bits of the second register and store the first absolute value in the most significant bits of a third register of the plurality of registers, and form a second absolute value of the least significant bits of the second register and store the second absolute value in the least significant bits of the third register;
employing one of the work units to extract the first absolute value from the third register;
employing one of the work units to determine a number of unused bits in the first absolute value;
employing one of the work units to generate a first headroom hₙ₊₁ of the first denominator operand by extracting a predetermined number of bits of the number of unused bits in the first absolute value;
employing one of the work units to compute a first shift factor sₙ₊₁ for the first denominator operand by adding the first headroom hₙ₊₁ to a first constant and subtracting a number of fractional bits in the first denominator operand;
employing one of the work units to extract the second absolute value from the third register;
employing one of the work units to determine a number of unused bits in the second absolute value;
employing one of the work units to generate a second headroom hₙ of the second denominator operand by extracting a predetermined number of bits of the number of unused bits in the second absolute value;
employing one of the work units to compute a second shift factor sₙ for the second denominator operand by adding the second headroom hₙ to the first constant and subtracting a number of fractional bits in the second denominator operand;
employing one of the work units to generate a first intermediate result by left shifting the data in the third register by an amount of the first headroom hₙ₊₁ bits;
employing one of the work units to generate a first division LUT index iₙ₊₁ by right shifting the first intermediate by a second constant;
employing one of the work units to generate a second intermediate result by left shifting the data in the third register by an amount of the second headroom hₙ bits;
employing one of the work units to generate a second division LUT index iₙ by right shifting the first intermediate by a third constant;
employing one of the work units to generate a third intermediate result by performing an exclusive OR of data in the first register and data in the second register;
employing one of the work units to extract a most significant bit of the most significant bits of the third intermediate result;
employing one of the work units to generate a sign of a first division mₙ₊₁ by subtracting twice the most significant bit of the most significant bits of the third intermediate result from 1;
employing one of the work units to extract a most significant bit of the least significant bits of the third intermediate result;
employing one of the work units to generate a sign of a second division mₙ by subtracting twice the most significant bit of the least significant bits of the third intermediate result from 1;
employing one of the work units to generate a first look-up table value by indexing a look-up table with the first division LUT index iₙ₊₁;
employing one of the work units to generate a second look-up table value by indexing a look-up table with the second division LUT index iₙ;
employing one of the work units to store the first look-up table value in most significant bits of a fourth register of the plurality of registers and store the second look-up table value in least significant bits of the fourth register;
employing one of the work units to generate a first product by multiplying the most significant bits of the fourth register by the most significant bits of the third register;
employing one of the work units to generate a first absolute division value by performing a saturated left shift of the first product by the headroom hₙ₊₁;
employing one of the work units to generate a first division value by multiplying the first absolute division value by the sign of a first division mₙ₊₁;
employing one of the work units to generate a second product by multiplying the least significant bits of the fourth register by the least significant bits of the third register;
employing one of the work units to generate a second absolute division value by performing a saturated left shift of the second product by the headroom hₙ;
employing one of the work units to generate a second division value by multiplying the second absolute division value by the sign of a second division mₙ; and
employing one of the work units to generate a packed division by storing the first division value in most significant bits of a fifth register of the plurality of registers and storing the second division value in least significant bits of the fifth register.
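The general flow recited in the claim (absolute value, headroom, normalized LUT index, reciprocal look-up, multiply, saturated shift, sign correction, packing) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the claimed method itself: N is taken as 32 (two 16-bit lanes), the index width `K`, the output format `OUT_FRAC`, the LUT scaling and all function names are hypothetical, the separate shift factor is folded into one fixed right shift, and the reciprocal is multiplied by the numerator magnitude.

```python
K = 6          # assumed number of LUT index bits (64 entries)
OUT_FRAC = 8   # assumed fractional bits in the quotient (Q8 output)

# LUT[i] ~ 2**29 / d for a normalized denominator d in [2**14, 2**15),
# sampled at the midpoint of each index bin (assumed scaling).
LUT = [round(2**29 / ((2**K + i + 0.5) * 2**(14 - K))) for i in range(2**K)]


def to_signed16(v):
    """Interpret the low 16 bits of v as a signed value."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def pack(hi, lo):
    """Store two 16-bit lanes in one 32-bit register value."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def unpack(reg):
    """Return the (most significant, least significant) signed 16-bit lanes."""
    return to_signed16(reg >> 16), to_signed16(reg)


def approx_div_lane(n, d):
    """Approximate n/d for signed 16-bit n, d; result is a saturated Q8 integer."""
    if d == 0:
        raise ZeroDivisionError("denominator lane is zero")
    ad = min(abs(d), 0x7FFF)               # absolute value, saturated
    h = 15 - ad.bit_length()               # headroom: unused bits above |d|
    norm = ad << h                         # normalized to [2**14, 2**15)
    idx = (norm >> (14 - K)) - 2**K        # LUT index from the top K bits
    sign = 1 - 2 * (((n ^ d) >> 15) & 1)   # +1 or -1 from XOR of sign bits
    product = min(abs(n), 0x7FFF) * LUT[idx]
    q = (product << h) >> (29 - OUT_FRAC)  # undo normalization, rescale to Q8
    return sign * min(q, 0x7FFF)           # saturate, then apply the sign


def packed_divide(num_reg, den_reg):
    """Divide both lanes of num_reg by the matching lanes of den_reg."""
    n_hi, n_lo = unpack(num_reg)
    d_hi, d_lo = unpack(den_reg)
    return pack(approx_div_lane(n_hi, d_hi), approx_div_lane(n_lo, d_lo))


if __name__ == "__main__":
    q_hi, q_lo = unpack(packed_divide(pack(100, -300), pack(4, 7)))
    print(q_hi / 2**OUT_FRAC, q_lo / 2**OUT_FRAC)  # compare with 25 and -300/7
```

With a 64-entry table the relative error of the reciprocal is bounded by roughly 1/128, so the quotients track the exact values to within about 1%. A larger `K` trades table size for accuracy.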
Address: Dallas TX US