Title of invention APPLICATION PROGRAMMING INTERFACES FOR DATA PARALLEL COMPUTING ON MULTIPLE PROCESSORS
Abstract A method and an apparatus for a parallel computing program calling APIs (application programming interfaces) in a host processor to perform a data processing task in parallel among compute units are described. The compute units, which include central processing units (CPUs) and graphics processing units (GPUs), are coupled to the host processor. A program object corresponding to a source code for the data processing task is generated in a memory coupled to the host processor according to the API calls. Executable codes for the compute units are generated from the program object according to the API calls to be loaded for concurrent execution among the compute units to perform the data processing task.
Publication number US2016188371(A1) Publication date 2016.06.30
Application number US201514977204 Filing date 2015.12.21
Applicant Apple Inc. Inventors Munshi Aaftab;Begeman Nathaniel
Classification G06F9/50 Main classification G06F9/50
Agency Agent
Main claim 1. A computerized method comprising: receiving, from a host application executing on a host processor, a request to execute a compute program on a selected compute device, the compute program corresponding to source code for a task in the host application and comprising a compute kernel containing executable code for the task, wherein the selected compute device corresponds to a compute identifier selected by the host application from one or more compute identifiers received by the host application from the host processor in response to a previous request from the host application to the host processor, the previous request including a processing requirement specified by the host application for the task to cause the host processor to identify compute devices that match the processing requirement for the task and to return compute identifiers for matching compute devices to the host application, the compute devices comprising at least one central processing unit (CPU) and at least one graphics processing unit (GPU) physically coupled to the host processor; if a compute library does not exist, or the compute program does not exist in an existing compute library, or the compute program in the existing compute library is not optimized for the selected compute device, creating, by the host processor, the compute program for execution on the selected compute device, including compiling the compute kernel from the source; and scheduling, by the host processor, the compute kernel for execution.
Address Cupertino CA US
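
Illustrative sketch (not part of the patent record). The control flow recited in the main claim can be pictured with a small simulation: the host application first asks for compute identifiers that match a processing requirement, selects one, and then requests execution; the host side compiles the kernel only when no library exists, the program is missing from the library, or the stored program targets a different device. Every name below (`ComputeDevice`, `HostRuntime`, `query_devices`, `execute`, the memory-size requirement) is a hypothetical stand-in chosen for illustration, not an API defined by the patent, and "compiling" is simulated with a string.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ComputeDevice:
    """Hypothetical description of one compute device."""
    compute_id: int
    kind: str        # "CPU" or "GPU"
    memory_mb: int


@dataclass
class HostRuntime:
    """Hypothetical host-side runtime; a simulation, not a real driver."""
    devices: list
    library: Optional[dict] = None          # None models "no compute library exists"
    queue: list = field(default_factory=list)
    compile_count: int = 0

    def query_devices(self, min_memory_mb: int) -> list:
        # The "previous request": return identifiers of devices that
        # match the processing requirement (assumed here to be memory size).
        return [d.compute_id for d in self.devices if d.memory_mb >= min_memory_mb]

    def _compile(self, source: str, compute_id: int) -> str:
        # Stand-in for compiling the compute kernel from source.
        self.compile_count += 1
        return f"{source!r} compiled for device {compute_id}"

    def execute(self, name: str, source: str, compute_id: int) -> str:
        # Condition 1: the compute library does not exist yet.
        if self.library is None:
            self.library = {}
        entry = self.library.get(name)
        # Condition 2: program missing from the library.
        # Condition 3: stored program targets a different device.
        if entry is None or entry["target"] != compute_id:
            entry = {"target": compute_id,
                     "kernel": self._compile(source, compute_id)}
            self.library[name] = entry
        # Schedule the compute kernel for execution on the selected device.
        self.queue.append((compute_id, entry["kernel"]))
        return entry["kernel"]


runtime = HostRuntime(devices=[ComputeDevice(0, "CPU", 4096),
                               ComputeDevice(1, "GPU", 8192)])
matching_ids = runtime.query_devices(min_memory_mb=6000)
selected = matching_ids[0]                   # the host application picks an identifier
runtime.execute("scale", "y = 2 * x", selected)
runtime.execute("scale", "y = 2 * x", selected)   # reuses the library entry
```

In this sketch the second `execute` call finds a library entry already built for the selected device, so nothing is recompiled; requesting the same program on a different identifier would trigger a new compile under the third condition.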