Title: System, method, and computer program product for bulk synchronous binary program translation and optimization
Abstract: A system, method, and computer program product are provided for bulk synchronous binary program translation and optimization. The method includes the steps of executing a block of translated binary instructions by multiple threads and gathering profiling data during execution of the block of translated binary instructions. The multiple threads are then synchronized at a barrier instruction associated with the block of translated binary instructions, and the block of translated binary instructions is replaced with optimized binary instructions, where the optimized binary instructions are produced based on the profiling data.
Publication number: US9207919(B2) Publication date: 2015.12.08
Application number: US201414158749 Filing date: 2014.01.17
Applicant: NVIDIA Corporation Inventor: Diamos Gregory Frederick
Classification: G06F9/45;G06F9/30 Main classification: G06F9/45
Agency: Zilka-Kotab, PC Agent: Zilka-Kotab, PC
Principal claim: 1. A method comprising: executing, on a parallel processor, a block of translated binary instructions by multiple threads; gathering profiling data during execution of the block of translated binary instructions; synchronizing the multiple threads at a barrier instruction associated with the block of translated binary instructions, wherein the barrier instruction specifies a barrier hierarchy level; replacing the block of translated binary instructions with optimized binary instructions, wherein the optimized binary instructions are produced based on the profiling data; determining a lower level barrier than the specified barrier hierarchy level is supported; and comparing the optimized binary instructions with one or more versions of binary instructions for the block that are associated with different multiple threads.
Address: Santa Clara CA US
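
Illustrative sketch (not part of the patent record): the core sequence described in the abstract, that is, execute a translated block on multiple threads, gather profiling data, synchronize at a barrier, and swap in an optimized block, can be modeled on CPU threads in Python. This is a minimal sketch under stated assumptions and not the patented implementation: `translated_block`, `make_optimized_block`, `CodeCache`, and `run` are hypothetical names, the specialization policy is invented for illustration, and the barrier hierarchy level and the version comparison steps of the claim are not modeled.

```python
import threading
from collections import Counter


def translated_block(x, profile):
    """Stand-in for an unoptimized translated block; records which path ran."""
    if x >= 0:
        profile["nonneg"] += 1
        return x * 2
    profile["neg"] += 1
    return -x * 2


def make_optimized_block(profile):
    """Hypothetical optimizer: specialize the block using the merged profile."""
    if profile["neg"] == 0:
        # The profile shows the negative path never ran, so specialize the
        # common path and keep a guard that falls back to the generic block.
        def optimized_block(x, _profile):
            if x < 0:
                return translated_block(x, _profile)
            return x << 1
        return optimized_block
    return translated_block  # no profitable specialization found


class CodeCache:
    """Holds the currently installed version of the block."""

    def __init__(self):
        self.block = translated_block
        self.replaced = False


def run(inputs_per_thread):
    cache = CodeCache()
    # One profile per thread avoids data races while gathering counts.
    profiles = [Counter() for _ in inputs_per_thread]
    results = [[] for _ in inputs_per_thread]

    def swap_in_optimized():
        # Runs in exactly one thread once all threads have reached the
        # barrier and before any is released, so no thread is executing
        # the old block while it is being replaced.
        merged = Counter()
        for p in profiles:
            merged.update(p)
        new_block = make_optimized_block(merged)
        cache.replaced = new_block is not translated_block
        cache.block = new_block

    barrier = threading.Barrier(len(inputs_per_thread), action=swap_in_optimized)

    def worker(tid):
        for x in inputs_per_thread[tid]:  # phase 1: profiled execution
            results[tid].append(cache.block(x, profiles[tid]))
        barrier.wait()                    # synchronize; block is replaced here
        for x in inputs_per_thread[tid]:  # phase 2: runs the installed block
            results[tid].append(cache.block(x, profiles[tid]))

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(len(inputs_per_thread))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, cache.replaced
```

Performing the replacement inside the barrier action is the design point the sketch tries to show: because every thread is parked at the barrier, the installed block can be swapped without any thread observing a half-updated state. For example, `run([[1, 2], [3, 4]])` would install the specialized block, while `run([[1, -2], [3, 4]])` would keep the generic one.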