Title of invention |
APPARATUS AND METHOD FOR EFFICIENT PREFIX SUM OPERATION |
Abstract |
An apparatus and method are described for performing a prefix sum. For example, one embodiment of an apparatus comprises: a graphics processor unit (GPU) comprising one or more execution units to execute single instruction multiple data (SIMD) instructions, the GPU to be provided with a plurality of data elements as input for a prefix sum operation; a first register of the GPU to store the plurality of data elements in specified data element positions; and the one or more execution units to perform a series of SIMD operations using the plurality of data elements, the SIMD operations performed using regioning techniques to generate the prefix sum, the SIMD operations including a first plurality of simultaneous addition operations to add specified data elements to generate intermediate results and further including a second plurality of simultaneous addition operations to add the intermediate results to other intermediate results to generate the prefix sum. |
Publication number |
US2016350262(A1) |
Publication date |
2016.12.01 |
Application number |
US201514727826 |
Filing date |
2015.06.01 |
Applicant |
SARANGI SATYAJIT;RAOUX THOMAS F. |
Inventor |
SARANGI SATYAJIT;RAOUX THOMAS F. |
Classification |
G06F15/80;G06T1/20;G06F9/30 |
Main classification |
G06F15/80 |
Agency |
|
Agent |
|
Independent claim |
1. A method comprising:
providing a plurality of data elements to a graphics processor unit (GPU) for input to a prefix sum operation;
storing the plurality of data elements in specified data element positions of a first register of the GPU; and
performing a series of single instruction multiple data (SIMD) operations using the plurality of data elements, the SIMD operations performed using regioning techniques to generate the prefix sum, the SIMD operations including a first plurality of simultaneous addition operations to add specified data elements to generate intermediate results and further including a second plurality of simultaneous addition operations to add the intermediate results to other intermediate results to generate the prefix sum. |
Address |
Santa Clara CA US |
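
The abstract and claim 1 describe a prefix sum built from rounds of simultaneous additions: a first round adds data elements to produce intermediate results, and later rounds add intermediate results to other intermediate results. The sketch below illustrates that general idea in Python; it is not the patented instruction sequence. It assumes an inclusive prefix sum and substitutes a Hillis-Steele scan, in which a shifted read of the register stands in loosely for register regioning. The names `simd_add_shifted` and `prefix_sum` are hypothetical and do not come from the record.

```python
def simd_add_shifted(lanes, offset):
    """Model one SIMD add: each lane i >= offset adds the value held in lane i - offset.

    Every lane reads the old register contents before any lane is written,
    which models the additions happening simultaneously. (Hypothetical helper.)
    """
    return [
        value + lanes[i - offset] if i >= offset else value
        for i, value in enumerate(lanes)
    ]


def prefix_sum(elements):
    """Inclusive prefix sum via a Hillis-Steele scan (assumed variant).

    The first round (offset 1) adds neighbouring data elements to form
    intermediate results; each later round adds intermediate results to
    other intermediate results, doubling the offset every time.
    """
    lanes = list(elements)
    offset = 1
    while offset < len(lanes):
        lanes = simd_add_shifted(lanes, offset)
        offset *= 2
    return lanes


register = [3, 1, 4, 1, 5, 9, 2, 6]  # eight data elements in one register
print(prefix_sum(register))  # [3, 4, 8, 9, 14, 23, 25, 31]
```

For eight elements this takes three rounds of additions (offsets 1, 2, and 4) rather than seven sequential adds, which is the kind of saving a SIMD formulation aims for.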