Title of invention: DYNAMICALLY DETECTING UNIFORMITY AND ELIMINATING REDUNDANT COMPUTATIONS TO REDUCE POWER CONSUMPTION
Abstract: One embodiment of the present invention includes techniques to decrease power consumption by reducing the number of redundant operations performed. In operation, a streaming multiprocessor (SM) identifies uniform groups of threads that, when executed, apply the same deterministic operation to uniform sets of input operands. Within each uniform group of threads, the SM designates one thread as the anchor thread. The SM disables execution units assigned to all of the threads except the anchor thread. The anchor execution unit, assigned to the anchor thread, executes the operation on the uniform set of input operands. Subsequently, the SM sets the outputs of the non-anchor threads included in the uniform group of threads to equal the value of the anchor execution unit output. Advantageously, by exploiting the uniformity of data to reduce the number of execution units that execute, the SM dramatically reduces the power consumption compared to conventional SMs.
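The grouping described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the function name `execute_warp` is hypothetical, every thread is assumed to apply the same deterministic operation, and operand sets are grouped by hashing, whereas an SM would presumably compare operand values in hardware. The returned execution count stands in for the number of execution units left enabled.

```python
import operator
from collections import defaultdict
from typing import Any, Callable, Dict, List, Sequence, Tuple


def execute_warp(
    op: Callable[..., Any],
    operands_per_thread: Sequence[Sequence[Any]],
) -> Tuple[List[Any], int]:
    """Model one deterministic operation issued to a group of threads.

    Hypothetical helper: returns (outputs per thread, number of times
    the operation was actually executed).
    """
    # Collect thread IDs whose operand sets are value-for-value equal.
    groups: Dict[Tuple[Any, ...], List[int]] = defaultdict(list)
    for tid, operands in enumerate(operands_per_thread):
        groups[tuple(operands)].append(tid)

    outputs: List[Any] = [None] * len(operands_per_thread)
    executions = 0
    for operands, tids in groups.items():
        # The first thread of each uniform group acts as the anchor thread;
        # only the anchor's "execution unit" runs the operation.
        anchor_output = op(*operands)
        executions += 1
        # Every thread in the group (anchor included) receives the anchor output.
        for tid in tids:
            outputs[tid] = anchor_output
    return outputs, executions


# Four threads, three of which share the operand set (2, 3).
outputs, executions = execute_warp(operator.mul, [(2, 3), (2, 3), (4, 1), (2, 3)])
print(outputs, executions)  # [6, 6, 4, 6] 2
```

In this model, four threads need only two executions, which mirrors the abstract's point that fewer execution units have to run when operand data is uniform.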
Publication number: US2015100764(A1) Publication date: 2015.04.09
Application number: US201314048647 Filing date: 2013.10.08
Applicant: NVIDIA CORPORATION Inventors: TAROLLI Gary M.; EDMONDSON John H.; BURGESS John Matthew; OHANNESSIAN Robert
Classification: G06F9/30 Main classification: G06F9/30
Agency: Agent:
Principal claim: 1. A system configured to eliminate redundant computations, the system comprising: a memory that includes a first set of operands associated with a first thread and a second set of operands associated with a second thread; and a streaming multiprocessor coupled to the memory and configured to: determine that both the first thread and the second thread are configured to execute a first deterministic operator; determine that a value of each operand included in the first set of operands equals a value of a corresponding operand included in the second set of operands; in response, activate a first uniformity signal; cause the first thread to execute the first deterministic operator on the first set of operands to generate a first output; and cause the second thread to set a second output equal to the first output without executing the first deterministic operator on the second set of operands.
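The sequence of determinations in claim 1 can be traced in a few lines of software. The following is a minimal sketch, assuming a hypothetical function `run_two_threads` and treating the uniformity signal as a plain boolean; the fallback branch, where the second thread executes normally when the signal is inactive, is an assumption for completeness and is not recited in the claim.

```python
from typing import Any, Callable, Sequence, Tuple


def run_two_threads(
    first_op: Callable[..., Any],
    first_operands: Sequence[Any],
    second_op: Callable[..., Any],
    second_operands: Sequence[Any],
) -> Tuple[bool, Any, Any]:
    """Hypothetical model: returns (uniformity_signal, first_output, second_output)."""
    # Determination 1: both threads are configured to execute the same operator.
    same_operator = first_op is second_op
    # Determination 2: each operand equals the corresponding operand.
    same_operands = len(first_operands) == len(second_operands) and all(
        a == b for a, b in zip(first_operands, second_operands)
    )
    # In response, the uniformity signal is activated.
    uniformity_signal = same_operator and same_operands

    # The first thread always executes the operator on its own operands.
    first_output = first_op(*first_operands)

    if uniformity_signal:
        # The second thread copies the first output without executing the operator.
        second_output = first_output
    else:
        # Assumed fallback (not part of the claim): execute normally.
        second_output = second_op(*second_operands)
    return uniformity_signal, first_output, second_output
```

Passing an operator that records its own calls shows the effect: with equal operand sets it is invoked once rather than twice.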
Address: SANTA CLARA CA US