Title of invention |
Cooperative thread array reduction and scan operations |
Abstract |
One embodiment of the present invention sets forth a technique for performing aggregation operations across multiple threads that execute independently. Aggregation is specified as part of a barrier synchronization or barrier arrival instruction: in addition to performing the barrier synchronization or arrival, the instruction aggregates (using reduction or scan operations) values supplied by each thread. When a thread executes the barrier aggregation instruction, the thread contributes to a scan or reduction result and waits to execute any more instructions until after all of the threads have executed the barrier aggregation instruction. A reduction result is communicated to each thread after all of the threads have executed the barrier aggregation instruction, and a scan result is communicated to each thread as the barrier aggregation instruction is executed by the thread. |
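
The reduction case described in the abstract can be illustrated in software. The following is a minimal sketch only, not the patented hardware mechanism: it assumes a sum as the reduction operator, and the names `ReductionBarrier`, `sync_reduce`, and `run_reduction` are hypothetical, introduced here for illustration. Each thread contributes a value on arrival, blocks until every thread has arrived, and then receives the same reduced result.

```python
import threading


class ReductionBarrier:
    """Hypothetical barrier that also sums the values supplied by each thread."""

    def __init__(self, parties):
        self._parties = parties          # number of threads expected at the barrier
        self._cond = threading.Condition()
        self._arrived = 0                # threads that have arrived in this round
        self._accum = 0                  # partial reduction for this round
        self._result = None              # published result of the last round
        self._generation = 0             # round counter, guards against spurious wakeups

    def sync_reduce(self, value):
        """Contribute `value`, wait for all threads, return the reduced sum."""
        with self._cond:
            generation = self._generation
            self._accum += value
            self._arrived += 1
            if self._arrived == self._parties:
                # Last arrival: publish the result, reset state, release waiters.
                self._result = self._accum
                self._accum = 0
                self._arrived = 0
                self._generation += 1
                self._cond.notify_all()
                return self._result
            # Earlier arrivals wait until the round completes.
            while generation == self._generation:
                self._cond.wait()
            return self._result


def run_reduction(values):
    """Run one thread per value; return the result each thread received."""
    barrier = ReductionBarrier(len(values))
    results = [None] * len(values)

    def worker(i):
        results[i] = barrier.sync_reduce(values[i])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(values))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_reduction([1, 2, 3, 4]))
```

In this sketch no thread can observe the result before the last thread arrives, which mirrors the statement that the reduction result is communicated only after all threads have executed the instruction.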
Publication number |
US9417875(B2) |
Publication date |
2016.08.16 |
Application number |
US201314025482 |
Filing date |
2013.09.12 |
Applicant |
NVIDIA CORPORATION |
Inventors |
Fahs Brian;Siu Ming Y.;Coon Brett W.;Nickolls John R.;Nyland Lars |
Classification codes |
G06F9/30;G06F15/00;G06F9/38;G06F9/52 |
Main classification code |
G06F9/30 |
Patent agency |
Artegis Law Group, LLP |
Patent agent |
Artegis Law Group, LLP |
Principal claim |
1. A method for performing a scan operation across multiple threads, the method comprising:
receiving a barrier instruction that specifies the scan operation for execution by a first thread of the multiple threads; combining a value associated with the first thread with a scan result for the multiple threads; communicating the scan result to the first thread; and causing another instruction to be executed without waiting until the barrier instruction is received by a second thread of the multiple threads. |
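
The steps of the claim can be sketched in software as an arrival-style scan. This is an illustrative sketch under stated assumptions, not the claimed implementation: it assumes an inclusive prefix sum taken in arrival order, and the names `ScanArrival`, `arrive_scan`, and `run_scan` are hypothetical. Each arriving thread folds its value into the running result, receives that result immediately, and continues without waiting for any other thread.

```python
import threading


class ScanArrival:
    """Hypothetical arrival point that keeps a running (inclusive) prefix sum."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = 0    # scan result accumulated over the arrivals so far

    def arrive_scan(self, value):
        """Combine `value` with the running scan result and return it at once.

        The caller does not block on other threads; the lock only makes the
        read-modify-write of the running result atomic.
        """
        with self._lock:
            self._running += value
            return self._running


def run_scan(values):
    """Run one thread per value; return the scan result each thread received."""
    scan = ScanArrival()
    results = [None] * len(values)

    def worker(i):
        results[i] = scan.arrive_scan(values[i])
        # The thread would go on to execute further instructions here
        # without waiting for the remaining threads to arrive.

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(values))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(sorted(run_scan([1] * 8)))
```

Because threads arrive in a nondeterministic order, which thread receives which prefix varies from run to run; only the set of prefixes and the final total are fixed.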
Address |
Santa Clara CA US |