Title of invention: Combining compute tasks for a graphics processing unit
Abstract: Methods, systems and devices are disclosed to examine developer-supplied graphics code and attributes at run-time. The graphics code is designed for execution on a graphics processing unit (GPU) and is written in a coding language such as OpenCL or OpenGL, which provides for run-time analysis by a driver, code generator, and compiler. Developer-supplied code and attributes can be analyzed and altered based on the execution capabilities and performance criteria of the GPU on which the code is about to be executed. In general, reducing the number of developer-defined work items or work groups can reduce the initialization cost of the GPU with respect to the work to be performed and result in an overall optimization of the machine code. Manipulation code can be added to adjust the supplied code, in a manner similar to unrolling a loop, to improve execution performance.
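The loop-unrolling analogy in the abstract can be illustrated with a small simulation. The following is a minimal sketch in plain Python, not the patented implementation: the functions `run_original` and `run_coalesced`, the doubling kernel body, and the `factor` parameter are all hypothetical names chosen for illustration. Each call to the inner `kernel` function stands in for one GPU work item, and the index remapping plus bounds check plays the role of the added manipulation code.

```python
def run_original(data):
    """One work item per element; returns (output, work items launched)."""
    out = [0] * len(data)

    def kernel(gid):
        # Developer-supplied body: handles exactly one element.
        out[gid] = data[gid] * 2

    for gid in range(len(data)):
        kernel(gid)
    return out, len(data)


def run_coalesced(data, factor):
    """Each work item handles up to `factor` elements (hypothetical factor)."""
    n = len(data)
    out = [0] * n

    def kernel(gid):
        # Manipulation code: remap the work item index to a block of elements.
        base = gid * factor
        for k in range(factor):
            i = base + k
            if i < n:  # guard the tail when n is not a multiple of factor
                out[i] = data[i] * 2

    items = (n + factor - 1) // factor  # ceiling division
    for gid in range(items):
        kernel(gid)
    return out, items
```

In this sketch both versions produce the same output, but the coalesced version launches fewer work items, which is where the saving in per-item initialization cost would come from on a real device.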
Publication number: US9442706(B2) Publication date: 2016.09.13
Application number: US201414448927 Filing date: 2014.07.31
Applicant: Apple Inc. Inventors: Avkarogullari Gokhan; Kan Alexander K.; Chiu Kelvin C.
Classification: G06F9/45; G06F9/445 Main classification: G06F9/45
Agency: Blank Rome LLP Agent: Blank Rome LLP
Main claim: 1. A method of processing code, the method comprising: obtaining a portion of the code that includes a first work group having a first work item, a second work group having a second work item, and attributes containing developer supplied criteria, the code and attributes describing execution parameters for a graphics processing unit (GPU), and each of the first and second work items having an execution overhead; analyzing the portion of the code and attributes with at least one of a compiler or driver; generating an altered code portion based on capabilities of the GPU by merging the first work group with the second work group to form a single merged work group that includes the first and second work items; coalescing the first and second work items of the altered code portion to form an altered and coalesced code portion, the altered and coalesced code portion comprising manipulation code to automatically adjust, at run-time, code elements affected by the coalescing, wherein a cost of executing the manipulation code is less than a cost of executing eliminated execution overhead resulting from the coalescing; compiling the altered and coalesced code portion; and loading the compiled altered and coalesced code portion for execution on the GPU.
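The cost condition in the claim (manipulation code must cost less than the execution overhead it eliminates) can be expressed as a simple comparison. The sketch below is an illustrative assumption, not the decision procedure of the patent: the function names, the linear cost model, and the cost units are hypothetical, and a real driver or compiler would derive such figures from the capabilities of the target GPU.

```python
def remaining_items(num_items, factor):
    """Work items left after coalescing `factor` items into one (ceiling division)."""
    return -(-num_items // factor)


def eliminated_overhead(num_items, factor, per_item_overhead):
    """Overhead saved: one per-item overhead for every work item removed."""
    removed = num_items - remaining_items(num_items, factor)
    return removed * per_item_overhead


def should_coalesce(num_items, factor, per_item_overhead, manipulation_cost):
    """True when the added manipulation code is cheaper than the overhead removed.

    Assumes (hypothetically) that each remaining work item pays
    `manipulation_cost` once, in the same arbitrary units as the overhead.
    """
    added = remaining_items(num_items, factor) * manipulation_cost
    return added < eliminated_overhead(num_items, factor, per_item_overhead)
```

For example, under this model `should_coalesce(1024, 4, 10.0, 3.0)` compares an added cost of 768 against a saving of 7680 and returns `True`, while `should_coalesce(8, 2, 1.0, 5.0)` compares 20 against 4 and returns `False`.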
Address: Cupertino, CA, US