Title of Invention: Fine-Grained Heterogeneous Computing
Abstract: A heterogeneous computing system described herein has an energy-efficient architecture that exploits producer-consumer locality, task parallelism and data parallelism. The heterogeneous computing system includes a task frontend that dispatches tasks and updated tasks from queues for execution based on properties associated with the queues, and execution units that include a first subset acting as producers to execute the tasks and generate the updated tasks, and a second subset acting as consumers to execute the updated tasks. The execution units include one or more control processors to perform control operations, vector processors to perform vector operations, and accelerators to perform multimedia signal processing operations. The heterogeneous computing system also includes a memory backend containing the queues to store the tasks and the updated tasks for execution by the execution units.
Publication No.: US2017068571(A1)  Publication Date: 2017.03.09
Application No.: US201514845647  Filing Date: 2015.09.04
Applicant: MediaTek Inc.  Inventors: LU Chien-Ping; HUANG Hsilin
Classification: G06F9/50  Main Classification: G06F9/50
Agency:  Agent:
Main Claim: 1. A heterogeneous computing system comprising: one or more central processing units (CPUs) to determine, based on dependency among tasks, how many queues to generate and a mapping of the queues to a plurality of execution units; a task frontend to receive an initial assignment of the tasks from the one or more CPUs, to dispatch the tasks and updated tasks from the queues for execution based on the mapping of the queues, and to manage, without intervention from the one or more CPUs, self-enqueue in which a consumer of a first task is same as a producer of the first task, and cross-enqueue in which a consumer of a second task is different from a producer of the second task; the plurality of execution units that include a first subset acting as producers to execute the tasks and generate the updated tasks, and a second subset acting as consumers to execute the updated tasks, wherein different execution units of the second subset receive the updated tasks via the task frontend from different queues, and wherein the execution units include one or more control processors to perform control operations, vector processors to perform vector operations, and accelerators to perform multimedia signal processing operations; and a memory backend containing the queues to store the tasks and the updated tasks for execution by the execution units.
Address: Hsinchu TW
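
Illustrative sketch (not part of the patent record): the queue-based dispatch recited in the main claim can be modeled in software roughly as follows. This is a minimal sketch under stated assumptions, not the claimed hardware design. The class `TaskFrontend`, the queue names `q_ctrl`, `q_vec` and `q_acc`, the `vector_passes` field and the two-pass rule are all hypothetical names and choices made for illustration. The queue-to-unit mapping is passed in by the caller, standing in for the mapping the CPUs determine, and after the initial assignment every enqueue is handled by the frontend alone.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    vector_passes: int = 0  # hypothetical field: vector passes applied so far


class TaskFrontend:
    """Dispatches tasks from queues and records self- vs cross-enqueue."""

    def __init__(self, queue_to_unit):
        # Stand-in for the queue-to-execution-unit mapping chosen by the CPUs.
        self.queue_to_unit = dict(queue_to_unit)
        # Stand-in for the queues held in the memory backend.
        self.queues = {name: deque() for name in self.queue_to_unit}
        self.log = []

    def enqueue(self, queue_name, task, producer=None):
        consumer = self.queue_to_unit[queue_name]
        if producer is not None:  # None marks the initial assignment from a CPU
            kind = "self" if producer == consumer else "cross"
            self.log.append((kind, producer, consumer))
        self.queues[queue_name].append(task)

    def run(self, units):
        # Keep sweeping the queues until a full sweep finds them all empty.
        progress = True
        while progress:
            progress = False
            for queue_name, queue in self.queues.items():
                if not queue:
                    continue
                progress = True
                unit_name = self.queue_to_unit[queue_name]
                result = units[unit_name](queue.popleft())
                if result is not None:
                    # The producer emitted an updated task; route it onward.
                    next_queue, updated = result
                    self.enqueue(next_queue, updated, producer=unit_name)


def run_demo():
    done = []

    def control(task):
        # Control processor hands the task to the vector queue (cross-enqueue).
        return ("q_vec", task)

    def vector(task):
        # Vector processor re-queues its own output twice (self-enqueue),
        # then forwards the task to the accelerator queue (cross-enqueue).
        if task.vector_passes < 2:
            return ("q_vec", Task(task.name, task.vector_passes + 1))
        return ("q_acc", task)

    def accelerator(task):
        # Accelerator is a pure consumer here and produces no updated task.
        done.append(task.name)
        return None

    frontend = TaskFrontend(
        {"q_ctrl": "control", "q_vec": "vector", "q_acc": "accelerator"}
    )
    frontend.enqueue("q_ctrl", Task("frame0"))  # initial assignment
    frontend.run({"control": control, "vector": vector, "accelerator": accelerator})
    return frontend.log, done
```

Calling `run_demo()` returns the enqueue log and the list of completed task names. In this sketch the log records one cross-enqueue from the control processor to the vector processor, two self-enqueues on the vector processor, and one cross-enqueue from the vector processor to the accelerator.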