Title of invention: Cooperative thread array reduction and scan operations
Abstract: One embodiment of the present invention sets forth a technique for performing aggregation operations across multiple threads that execute independently. Aggregation is specified as part of a barrier synchronization or barrier arrival instruction: in addition to performing the barrier synchronization or arrival, the instruction aggregates (using reduction or scan operations) the values supplied by each thread. When a thread executes the barrier aggregation instruction, the thread contributes to a scan or reduction result and executes no further instructions until all of the threads have executed the barrier aggregation instruction. A reduction result is communicated to each thread after all of the threads have executed the barrier aggregation instruction, whereas a scan result is communicated to each thread as that thread executes the barrier aggregation instruction.
Publication number: US9417875(B2) Publication date: 2016.08.16
Application number: US201314025482 Filing date: 2013.09.12
Applicant: NVIDIA CORPORATION Inventors: Fahs Brian;Siu Ming Y.;Coon Brett W.;Nickolls John R.;Nyland Lars
Classification: G06F9/30;G06F15/00;G06F9/38;G06F9/52 Main classification: G06F9/30
Agency: Artegis Law Group, LLP Agent: Artegis Law Group, LLP
Principal claim: 1. A method for performing a scan operation across multiple threads, the method comprising: receiving a barrier instruction that specifies the scan operation for execution by a first thread of the multiple threads; combining a value associated with the first thread with a scan result for the multiple threads; communicating the scan result to the first thread; and causing another instruction to be executed without waiting until the barrier instruction is received by a second thread of the multiple threads.
Address: Santa Clara CA US
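
The two behaviors described in the abstract and in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware mechanism: the class `AggregatingBarrier`, its methods `sync_reduce` and `arrive_scan`, and the helper `run_demo` are hypothetical names introduced here, the aggregation operator is assumed to be addition, and the scan is assumed to be inclusive and ordered by arrival. `sync_reduce` blocks until every thread has contributed and then hands the same reduction result to all of them; `arrive_scan` combines the caller's value with the running scan result and returns at once, without waiting for the other threads.

```python
import threading
from operator import add


class AggregatingBarrier:
    """Hypothetical software model of a barrier that also aggregates values."""

    def __init__(self, parties, op=add, identity=0):
        self._parties = parties
        self._op = op
        self._identity = identity
        self._cond = threading.Condition()
        # State for the blocking reduction.
        self._acc = identity
        self._arrived = 0
        self._generation = 0
        self._result = None
        # State for the non-blocking scan.
        self._scan_acc = identity
        self._scan_arrived = 0

    def sync_reduce(self, value):
        """Contribute a value, wait for all parties, return the reduction."""
        with self._cond:
            generation = self._generation
            self._acc = self._op(self._acc, value)
            self._arrived += 1
            if self._arrived == self._parties:
                # Last arrival publishes the result and releases the waiters.
                self._result = self._acc
                self._acc = self._identity
                self._arrived = 0
                self._generation += 1
                self._cond.notify_all()
            else:
                while generation == self._generation:
                    self._cond.wait()
            return self._result

    def arrive_scan(self, value):
        """Contribute a value and return the running scan result immediately."""
        with self._cond:
            self._scan_acc = self._op(self._scan_acc, value)
            result = self._scan_acc  # inclusive scan in arrival order (assumed)
            self._scan_arrived += 1
            if self._scan_arrived == self._parties:
                # All parties have arrived: reset for the next use.
                self._scan_acc = self._identity
                self._scan_arrived = 0
            return result


def run_demo(values):
    """Run one thread per value; return (reduction results, scan results)."""
    barrier = AggregatingBarrier(len(values))
    reductions = [None] * len(values)
    scans = [None] * len(values)

    def worker(i):
        scans[i] = barrier.arrive_scan(values[i])       # does not wait
        reductions[i] = barrier.sync_reduce(values[i])  # waits for everyone

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(len(values))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return reductions, scans


if __name__ == "__main__":
    print(run_demo([3, 1, 4, 1, 5]))
```

Every thread receives the same reduction result, while each scan result depends on the order in which the threads arrive. When every thread contributes 1, the scan hands each thread a distinct rank, which is a common way to assign output slots.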