Title of Invention |
Vector math instruction execution by DSP processor approximating division and complex number magnitude |
Abstract |
A digital signal processor (DSP) includes an instruction fetch unit, an instruction decode unit, a register set and a plurality of work units in communication with the instruction decode unit. A first embodiment calculates two divisions on packed numerators and packed denominators. The DSP work units calculate indexes into a 1/d look-up table and make a final sign correction. A second embodiment calculates an approximation of a vector magnitude of a complex number x+jy. The approximation is based upon √(x²+y²) ≈ α*max(|x|, |y|) + β*min(|x|, |y|). The DSP work units calculate the absolute values, find the maxima and minima, and form the packed results of two vector magnitude calculations. |
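The magnitude approximation named in the abstract can be illustrated with a minimal floating-point sketch. The abstract does not state which α and β the DSP uses; the values below are a commonly quoted pair for this formula and are an assumption, as are the function names `approx_magnitude` and `max_relative_error`. The sketch handles one complex number at a time and does not model the packed two-lane registers.

```python
import math

# Assumed coefficients (not taken from the patent record): a commonly quoted
# pair for the alpha-max-plus-beta-min magnitude estimate.
ALPHA = 0.96043387
BETA = 0.39782473


def approx_magnitude(x, y, alpha=ALPHA, beta=BETA):
    """Approximate sqrt(x^2 + y^2) as alpha*max(|x|,|y|) + beta*min(|x|,|y|)."""
    ax, ay = abs(x), abs(y)          # absolute values of both components
    return alpha * max(ax, ay) + beta * min(ax, ay)


def max_relative_error(steps=3600):
    """Sweep unit-length vectors and report the worst relative error."""
    worst = 0.0
    for k in range(steps):
        t = 2.0 * math.pi * k / steps
        x, y = math.cos(t), math.sin(t)
        exact = math.hypot(x, y)     # reference magnitude (about 1.0)
        worst = max(worst, abs(approx_magnitude(x, y) - exact) / exact)
    return worst
```

Calling `approx_magnitude(3, 4)` gives a value close to the exact magnitude 5, and `max_relative_error()` reports how far the estimate strays for the chosen coefficients. The estimate needs only absolute value, compare, multiply and add, which is why it maps onto simple work-unit operations.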
Publication Number |
US9015452(B2) |
Publication Date |
2015.04.21 |
Application Number |
US201012708180 |
Filing Date |
2010.02.18 |
Applicant |
Texas Instruments Incorporated |
Inventor |
Dasgupta Udayan |
Classification Codes |
G06F9/302;G06F9/30;G06F9/38;G06F7/48;G06F7/535;G06F7/548;G06F7/552 |
Main Classification Code |
G06F9/302 |
Agency |
|
Agent |
Marshall, Jr. Robert B.;Cimino Frank D. |
Principal Claim |
1. A method for performing an approximate division on a digital signal processor having a plurality of registers, each register storing data of N bits, and a plurality of work units, each work unit performing data processing operations under instruction control, the steps comprising:
storing a first numerator operand of N/2 bits in a set of N/2 most significant bits of a first register of the plurality of registers;
storing a second numerator operand of N/2 bits in a set of N/2 least significant bits of the first register;
storing a first denominator operand of N/2 bits in a set of N/2 most significant bits of a second register of the plurality of registers;
storing a second denominator operand of N/2 bits in a set of N/2 least significant bits of the second register;
employing one of the work units to separately form a first absolute value of the most significant bits of the second register and store the first absolute value in the most significant bits of a third register of the plurality of registers, and form a second absolute value of the least significant bits of the second register and store the second absolute value in the least significant bits of the third register;
employing one of the work units to extract the first absolute value from the third register;
employing one of the work units to determine a number of unused bits in the first absolute value;
employing one of the work units to generate a headroom hₙ₊₁ of the first denominator operand by extracting a predetermined number of bits of the number of unused bits in the first absolute value;
employing one of the work units to compute a first shift factor sₙ₊₁ for the first denominator operand by adding the first headroom hₙ₊₁ to a first constant and subtracting a number of fractional bits in the first denominator operand;
employing one of the work units to extract the second absolute value from the third register;
employing one of the work units to determine a number of unused bits in the second absolute value;
employing one of the work units to generate a second headroom hₙ of the second denominator operand by extracting a predetermined number of bits of the number of unused bits in the second absolute value;
employing one of the work units to compute a second shift factor sₙ for the second denominator operand by adding the second headroom hₙ to the first constant and subtracting a number of fractional bits in the second denominator operand;
employing one of the work units to generate a first intermediate result by left shifting the data in the third register by an amount of the first headroom hₙ₊₁ bits;
employing one of the work units to generate a first division LUT index iₙ₊₁ by right shifting the first intermediate result by a second constant;
employing one of the work units to generate a second intermediate result by left shifting the data in the third register by an amount of the second headroom hₙ bits;
employing one of the work units to generate a second division LUT index iₙ by right shifting the first intermediate result by a third constant;
employing one of the work units to generate a third intermediate result by performing an exclusive OR of data in the first register and data in the second register;
employing one of the work units to extract a most significant bit of the most significant bits of the third intermediate result;
employing one of the work units to generate a sign of a first division mₙ₊₁ by subtracting twice the most significant bit of the most significant bits of the third intermediate result from 1;
employing one of the work units to extract a most significant bit of the least significant bits of the third intermediate result;
employing one of the work units to generate a sign of a second division mₙ by subtracting twice the most significant bit of the least significant bits of the third intermediate result from 1;
employing one of the work units to generate a first look-up table value by indexing a look-up table with the first division LUT index iₙ₊₁;
employing one of the work units to generate a second look-up table value by indexing a look-up table with the second division LUT index iₙ;
employing one of the work units to store the first look-up table value in most significant bits of a fourth register of the plurality of registers and store the second look-up table value in least significant bits of the fourth register;
employing one of the work units to generate a first product by multiplying the most significant bits of the fourth register by the most significant bits of the third register;
employing one of the work units to generate a first absolute division value by performing a saturated left shift of the first product by the headroom hₙ₊₁;
employing one of the work units to generate a first division value by multiplying the first absolute division value by the sign of a first division mₙ₊₁;
employing one of the work units to generate a second product by multiplying the least significant bits of the fourth register by the least significant bits of the third register;
employing one of the work units to generate a second absolute division value by performing a saturated left shift of the second product by the headroom hₙ;
employing one of the work units to generate a second division value by multiplying the second absolute division value by the sign of a second division mₙ; and
employing one of the work units to generate a packed division by storing the first division value in most significant bits of a fifth register of the plurality of registers and storing the second division value in least significant bits of the fifth register. |
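The general idea behind the claimed steps (packed operands, denominator headroom, a reciprocal look-up table, an XOR-derived sign and a saturated shift) can be sketched in software. The following is a simplified illustration and not the claimed method: it assumes N = 32 with 16-bit lanes, a 64-entry table, a Q15 result, and a product of the table value with the numerator magnitude. The claim's first, second and third constants are not given in this record, so every constant and every name here (`LUT_BITS`, `RECIP_LUT`, `pack`, `unpack`, `packed_divide`) is hypothetical.

```python
LUT_BITS = 6  # hypothetical table size: 2**6 = 64 entries

# Q15 reciprocals of mantissas in [1, 2), sampled at bin midpoints (assumed layout).
RECIP_LUT = [round((1 << 15) * 64 / (64 + i + 0.5)) for i in range(1 << LUT_BITS)]


def pack(hi, lo):
    """Place two signed 16-bit values in the upper and lower halves of a 32-bit word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def unpack(word):
    """Return the sign-extended (upper, lower) halves of a 32-bit word."""
    def sext(v):
        return v - 0x10000 if v & 0x8000 else v
    return sext((word >> 16) & 0xFFFF), sext(word & 0xFFFF)


def _lane_divide(abs_n, abs_d, sign):
    """Approximate sign * abs_n / abs_d as a saturated Q15 value."""
    if abs_d == 0:
        return sign * 0x7FFF                      # saturate on divide by zero
    h = 15 - abs_d.bit_length()                   # headroom: unused leading bits
    norm = abs_d << h                             # normalized to [2**14, 2**15)
    idx = (norm >> (14 - LUT_BITS)) - (1 << LUT_BITS)  # top mantissa bits as index
    prod = abs_n * RECIP_LUT[idx]                 # numerator times 1/d estimate
    q = min((prod << h) >> 14, 0x7FFF)            # undo normalization, saturate
    return sign * q


def packed_divide(num_word, den_word):
    """Divide both 16-bit lanes of num_word by the matching lanes of den_word."""
    n_hi, n_lo = unpack(num_word)
    d_hi, d_lo = unpack(den_word)
    x = num_word ^ den_word                       # XOR exposes per-lane sign mismatch
    sign_hi = 1 - 2 * ((x >> 31) & 1)             # +1 or -1 for the upper lane
    sign_lo = 1 - 2 * ((x >> 15) & 1)             # +1 or -1 for the lower lane
    mag = lambda v: min(abs(v), 0x7FFF)           # saturating absolute value
    q_hi = _lane_divide(mag(n_hi), mag(d_hi), sign_hi)
    q_lo = _lane_divide(mag(n_lo), mag(d_lo), sign_lo)
    return pack(q_hi, q_lo)
```

For example, `unpack(packed_divide(pack(1000, -3000), pack(4000, 6000)))` yields two Q15 values near 0.25 and -0.5. With a 64-entry midpoint table the relative error stays under roughly one percent; a larger table would trade memory for accuracy.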
Address |
Dallas TX US |