Title of invention: N-way memory barrier operation coalescing
Abstract: One embodiment sets forth a technique for N-way memory barrier operation coalescing. When a first memory barrier is received for a first thread group, execution of subsequent memory operations for the first thread group is suspended until the first memory barrier is executed. Subsequent memory barriers for different thread groups may be coalesced with the first memory barrier to produce a coalesced memory barrier that represents memory barrier operations for multiple thread groups. When the coalesced memory barrier is being processed, execution of subsequent memory operations for the different thread groups is also suspended. However, memory operations for other thread groups that are not affected by the coalesced memory barrier may be executed.
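The suspension behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the class `BarrierCoalescer`, its method names, and the string-encoded operations are all hypothetical, and the model tracks a single in-flight coalesced barrier.

```python
from dataclasses import dataclass, field


@dataclass
class BarrierCoalescer:
    """Toy model (hypothetical): one in-flight coalesced barrier covering a set of thread groups."""
    blocked_groups: set = field(default_factory=set)  # groups covered by the coalesced barrier
    held_ops: list = field(default_factory=list)      # (group, op) pairs suspended behind it
    issued_ops: list = field(default_factory=list)    # (group, op) pairs allowed to execute

    def membar(self, group):
        # The first barrier opens the coalesced barrier; later barriers merge into it.
        self.blocked_groups.add(group)

    def memory_op(self, group, op):
        # Operations from covered groups are suspended; all other groups proceed.
        if group in self.blocked_groups:
            self.held_ops.append((group, op))
        else:
            self.issued_ops.append((group, op))

    def barrier_executed(self):
        # Release every covered group at once and issue their suspended operations in order.
        self.blocked_groups.clear()
        self.issued_ops.extend(self.held_ops)
        self.held_ops.clear()


c = BarrierCoalescer()
c.membar(0)               # first barrier, thread group 0
c.memory_op(0, "st A")    # suspended: group 0 is behind its barrier
c.memory_op(1, "ld B")    # executes: group 1 has not issued a barrier yet
c.membar(1)               # second barrier, coalesced with the first
c.memory_op(1, "st C")    # suspended: group 1 is now covered as well
c.memory_op(2, "ld D")    # executes: group 2 is unaffected
```

Calling `c.barrier_executed()` afterwards releases groups 0 and 1 together, which is the point of coalescing: one barrier completion unblocks every thread group it represents.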
Publication number: US8997103(B2)  Publication date: 2015.03.31
Application number: US201213441785  Filing date: 2012.04.06
Applicant: NVIDIA Corporation  Inventors: Gadre Shirish; McCarver Charles; Rajendran Anjana; Paranjape Omkar; Heinrich Steven James
Classification: G06F9/46; G06F1/04; G06F7/00; G06F9/38; G06F9/30; G06F9/52  Main classification: G06F9/46
Agency: Artegis Law Group, LLP  Agent: Artegis Law Group, LLP
Main claim: 1. A computer-implemented method for processing memory barrier instructions, the method comprising: receiving a first memory barrier instruction for a first thread group that includes multiple parallel execution threads; blocking the execution of memory transactions for the first thread group that are subsequent to the first memory barrier instruction in program order; receiving, subsequent to the first memory barrier instruction, a first set of memory transactions and a second memory barrier instruction for at least a second thread group that includes multiple execution threads; coalescing the first memory barrier instruction and the second memory barrier instruction to generate a coalesced memory barrier instruction; tagging each transaction in the first set of memory transactions with a first coalescing index associated with the coalesced memory barrier instruction to generate tagged memory commands; combining the tagged memory commands and the coalesced memory barrier instruction to generate a tagged memory command stream; transmitting the tagged memory command stream and memory transactions for the first thread group that are prior to the first memory barrier instruction in program order to a memory management unit to process the memory transactions for the first thread group that are prior to the first memory barrier instruction in program order, the first set of memory transactions, the first memory barrier instruction, and the second memory barrier instruction; determining that the memory transactions for the first thread group that are prior to the first memory barrier instruction in program order and the first set of memory transactions are committed to memory; and releasing both the first memory barrier instruction to allow the memory transactions for the first thread group that are subsequent to the first memory barrier instruction in program order to be executed and the second memory barrier instruction to allow the memory transactions for the second thread group that are subsequent to the second memory barrier instruction in program order to be executed.
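The tag, transmit, commit, and release steps of the claim can be sketched as a tag-and-count tracker. This is a simplified illustration and not the claimed implementation: the class `CoalescedBarrierTracker` and its methods are hypothetical, and as a simplifying assumption the model tags every transaction ahead of the coalesced barrier with the coalescing index, including the first thread group's pre-barrier transactions, which the claim only requires to be transmitted alongside the tagged stream.

```python
class CoalescedBarrierTracker:
    """Toy model (hypothetical): tag transactions with a coalescing index, release on commit."""

    def __init__(self):
        self.index = 0       # coalescing index of the barrier currently being built
        self.stream = []     # tagged command stream, in issue order
        self.pending = {}    # coalescing index -> number of uncommitted transactions
        self.groups = {}     # coalescing index -> thread groups waiting on that barrier
        self.closed = set()  # indices whose coalesced barrier has been emitted
        self.released = []   # thread groups released so far, in release order

    def transaction(self, group, op):
        # Tag the transaction with the current coalescing index and count it as outstanding.
        self.stream.append(("txn", self.index, group, op))
        self.pending[self.index] = self.pending.get(self.index, 0) + 1

    def membar(self, group):
        # Merge this group's barrier into the coalesced barrier under construction.
        self.groups.setdefault(self.index, set()).add(group)

    def emit_barrier(self):
        # Append the coalesced barrier to the stream; later transactions get the next index.
        idx = self.index
        self.stream.append(("membar", idx, tuple(sorted(self.groups.get(idx, ())))))
        self.closed.add(idx)
        self.index += 1
        self._try_release(idx)

    def commit(self, idx):
        # The memory side reports one transaction with this index as committed.
        self.pending[idx] -= 1
        self._try_release(idx)

    def _try_release(self, idx):
        # Release all coalesced groups once every tagged transaction has committed.
        if idx in self.closed and self.pending.get(idx, 0) == 0:
            self.released.extend(sorted(self.groups.pop(idx, ())))


t = CoalescedBarrierTracker()
t.transaction(0, "st A")  # group 0, before its barrier in program order
t.membar(0)               # first memory barrier instruction
t.transaction(1, "st B")  # the "first set" of transactions, from group 1
t.membar(1)               # second memory barrier instruction, coalesced with the first
t.emit_barrier()
t.commit(0)               # one of two tagged transactions committed
released_after_one_commit = list(t.released)
t.commit(0)               # last tagged transaction committed: both groups are released
```

Counting commits per coalescing index is one way to decide when the coalesced barrier can be released; the claim itself states only that the commit is determined, not how.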
Address: Santa Clara, CA, US