Title: Vector processing engines having programmable data path configurations for providing multi-mode Radix-2^X butterfly vector processing circuits, and related vector processors, systems, and methods
Abstract: Vector processing engines (VPEs) having programmable data path configurations for providing multi-mode Radix-2^X butterfly vector processing circuits. Related vector processors, systems, and methods are also disclosed. The VPEs disclosed herein include a plurality of vector processing stages, each having vector processing blocks that have programmable data path configurations for performing Radix-2^X butterfly vector operations to perform Fast Fourier Transform (FFT) vector processing operations efficiently. The data path configurations of the vector processing blocks can be programmed to provide different types of Radix-2^X butterfly vector operations as well as other arithmetic logic vector operations. As a result, fewer VPEs can provide the desired Radix-2^X butterfly vector operations and other types of arithmetic logic vector operations in a vector processor, thus saving area in the vector processor while still retaining the vector processing advantages of fewer register writes and faster vector instruction execution times over scalar processing engines.
Publication No.: US9275014(B2) Publication date: 2016.03.01
Application No.: US201313798599 Filing date: 2013.03.13
Applicant: QUALCOMM Incorporated Inventor: Khan Raheel
Classification: G06F17/14; G06F9/30; G06F9/38; G06F7/544; G06F15/80 Main classification: G06F17/14
Agency: Novak Druce Connolly Bove + Quigg LLP Agent: Novak Druce Connolly Bove + Quigg LLP
Independent claim: 1. A vector processing engine (VPE) configured to provide at least one multi-mode Radix-2^X butterfly vector processing circuit, comprising: at least one multiply vector processing stage comprising at least one multiplier block configured to: receive a Radix vector data input sample set from a plurality of Radix vector data input sample sets from a first input data path among a plurality of input data paths; multiply the Radix vector data input sample set with a twiddle factor component to provide a Radix vector multiply output sample set in a plurality of multiply output data paths based on a programmable multiply data path configuration according to a Radix butterfly vector instruction executed by the at least one multiply vector processing stage; and at least one accumulation vector processing stage comprising a plurality of accumulator blocks, each accumulator block among the plurality of accumulator blocks configured to: receive a plurality of Radix vector multiply output sample sets from a multiply output data path among the plurality of multiply output data paths based on a programmable accumulator data path configuration; apply a twiddle factor input to the received plurality of Radix vector multiply output sample sets based on the programmable accumulator data path configuration; accumulate the plurality of Radix vector multiply output sample sets with the applied twiddle factor input to provide a Radix vector accumulated result sample set based on the programmable accumulator data path configuration; and provide the Radix vector accumulated result sample set in an output data path among a plurality of output data paths; and an output processing stage configured to receive a plurality of Radix vector accumulated result sample sets from each of the plurality of accumulator blocks; wherein the plurality of multiply output data paths are programmable to fuse the at least one multiplier block to the plurality of accumulator blocks to form the at least one multi-mode Radix-2^X butterfly vector processing circuit.
Address: San Diego, CA, US
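Illustrative sketch (not part of the patent record): claim 1 describes a multiply stage that applies a twiddle factor, followed by accumulator blocks that combine the resulting products. In the simplest case, X = 1, this resembles the textbook radix-2 butterfly. The Python below models that split in software, assuming a recursive decimation-in-time FFT. The function names (`multiply_stage`, `accumulate_stage`, `radix2_butterfly`, `fft_radix2`) are hypothetical and do not come from the patent, and the code does not attempt to model the claimed programmable hardware data paths.

```python
import cmath


def multiply_stage(b, twiddle):
    """Multiply one input sample by a twiddle factor (multiplier-block role)."""
    return b * twiddle


def accumulate_stage(a, product):
    """Combine a sample with the product using the trivial radix-2
    factors +1 and -1 (accumulator-block role)."""
    return a + product, a - product


def radix2_butterfly(a, b, twiddle):
    """One radix-2 butterfly: multiply stage feeding the accumulate stage."""
    return accumulate_stage(a, multiply_stage(b, twiddle))


def fft_radix2(samples):
    """Recursive decimation-in-time FFT built only from radix-2 butterflies.

    Assumption: len(samples) is a power of two.
    """
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft_radix2(samples[0::2])
    odd = fft_radix2(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor W_n^k
        out[k], out[k + n // 2] = radix2_butterfly(even[k], odd[k], w)
    return out


def naive_dft(samples):
    """Direct O(n^2) DFT, used only as a reference for checking the FFT."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]
```

For example, `fft_radix2([1, 0, 0, 0])` yields four values equal to 1, the flat spectrum of a unit impulse, and the result for any power-of-two input can be compared against `naive_dft` as a sanity check.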