Title of invention VECTOR PROCESSING ENGINES (VPEs) EMPLOYING A TAPPED-DELAY LINE(S) FOR PROVIDING PRECISION FILTER VECTOR PROCESSING OPERATIONS WITH REDUCED SAMPLE RE-FETCHING AND POWER CONSUMPTION, AND RELATED VECTOR PROCESSOR SYSTEMS AND METHODS
Abstract Vector processing engines (VPEs) employing a tapped-delay line(s) for providing precision filter vector processing operations with reduced sample re-fetching and power consumption are disclosed. Related vector processor systems and methods are also disclosed. The VPEs are configured to provide filter vector processing operations. To minimize re-fetching of input vector data samples from memory, and thereby reduce power consumption, a tapped-delay line(s) is included in the data flow paths between a vector data file and execution units in the VPE. The tapped-delay line(s) is configured to receive input vector data sample sets and provide them to execution units for performing filter vector processing operations. The tapped-delay line(s) is also configured to shift the input vector data sample set for filter delay taps and provide the shifted input vector data sample set to execution units, so the shifted input vector data sample set does not have to be re-fetched during filter vector processing operations.
Publication number US2015143078(A1) Publication date 2015.05.21
Application number US201314082075 Filing date 2013.11.15
Applicant QUALCOMM Incorporated Inventors Khan Raheel;Mujahid Fahad Ali;Shiravi Afshin
Classification G06F9/30 Main classification G06F9/30
Agency Agent
Main claim 1. A vector processing engine (VPE) configured to provide a filter vector processing operation, comprising: at least one vector data file configured to: provide an input vector data sample set in at least one input data flow path for a filter vector processing operation; and receive a resultant filtered output vector data sample set from at least one output data flow path; and store the resultant filtered output vector data sample set; at least one tapped-delay line provided between the at least one vector data file and at least one execution unit in the at least one input data flow path, the at least one tapped-delay line configured to shift the input vector data sample set by a vector data sample width in a plurality of pipeline registers for each processing stage among a plurality of processing stages equal to a number of filter taps in the filter vector processing operation, to provide a shifted input vector data sample set for each processing stage among the plurality of processing stages; and the at least one execution unit provided in the at least one input data flow path, comprising: at least one multiplier configured to apply a filter tap operation on the shifted input vector data sample set for each processing stage among the plurality of processing stages, to generate a filter tap output vector data sample set for each filter tap of the filter vector processing operation; and at least one accumulator configured to accumulate the filter tap output vector data sample sets in the at least one accumulator for each processing stage among the plurality of processing stages; the at least one execution unit configured to provide the resultant filtered output vector data sample set on the at least one output data flow path.
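The data flow recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the function name `fir_tapped_delay`, the split into `primary` and `shadow` registers, and the indexing convention y[n] = sum over k of taps[k] * samples[n + k] are all hypothetical choices made for illustration. It shows the one idea the claim turns on: the sample set is fetched once, and each processing stage (one per filter tap) shifts it by one sample width, multiplies by that tap's coefficient, and accumulates.

```python
from collections import deque


def fir_tapped_delay(samples, taps, width):
    """Model `width` parallel FIR outputs using a tapped-delay line.

    Assumed convention (not taken from the patent text):
        y[n] = sum_k taps[k] * samples[n + k],  for n in 0..width-1
    """
    needed = width + len(taps) - 1
    if len(samples) < needed:
        raise ValueError(f"need at least {needed} samples, got {len(samples)}")

    # Single fetch from the (hypothetical) vector data file: the primary
    # sample set plus the extra samples that will later be shifted in.
    primary = deque(samples[:width])
    shadow = deque(samples[width:needed])

    acc = [0] * width  # one accumulator per lane

    # One processing stage per filter tap.
    for coeff in taps:
        # Multiplier + accumulator: apply this tap to the current sample set.
        for lane, x in enumerate(primary):
            acc[lane] += coeff * x
        # Tapped-delay line: shift by one sample width instead of re-fetching.
        if shadow:
            primary.popleft()
            primary.append(shadow.popleft())

    return acc


def fir_refetch(samples, taps, width):
    """Reference model that re-reads the sample array for every tap."""
    return [
        sum(c * samples[n + k] for k, c in enumerate(taps))
        for n in range(width)
    ]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7]
    coeffs = [1, 0, -1]
    print(fir_tapped_delay(data, coeffs, width=4))
    print(fir_refetch(data, coeffs, width=4))
```

Both functions return the same result for the same inputs; the difference is that `fir_tapped_delay` reads `samples` once up front, whereas `fir_refetch` indexes back into it for every tap, which is the re-fetching the abstract says the tapped-delay line avoids.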
Address San Diego CA US