Title: Systems and methods for data-parallel processing
Abstract: Methods, systems, and mediums are described for scheduling data-parallel tasks onto multiple thread execution units of a processing system. Embodiments of a lock-free queue structure and methods of operation are described to implement a method for scheduling fine-grained data-parallel tasks for execution in a computing system. The work of one of a plurality of worker threads is wait-free with respect to the other worker threads. Each node of the queue holds a reference to a task that may be concurrently performed by multiple thread execution units, each on a different subset of the data. Various embodiments relate to software-based scheduling of data-parallel tasks on a multi-threaded computing platform that does not perform such scheduling in hardware. Other embodiments are also described and claimed.
Publication No.: US8954986(B2)  Publication Date: 2015.02.10
Application No.: US201012971891  Filing Date: 2010.12.17
Applicant: Intel Corporation  Inventors: Rajagopalan Mohan; Adl-Tabatabai Ali-Reza; Ni Yang; Welc Adam; Hudson Richard L.
Classification: G06F9/46; G05B19/18; G06F7/38; G06F9/48; G06F9/50  Main Classification: G06F9/46
Agency: Schwabe, Williamson & Wyatt, P.C.  Agent: Schwabe, Williamson & Wyatt, P.C.
Principal Claim: 1. A system, comprising: a processing system having a plurality of thread execution units to execute a plurality of threads for concurrent data processing; at least one memory element coupled to the processing system, wherein the memory element is to store a data parallel queue having a structure to hold one or more nodes, individual nodes include a plurality of work items, and each work item corresponds to a different subset of a set of data, and wherein the data parallel queue is a singly-linked linked list; and scheduling logic to cause multiple different threads to concurrently access the data parallel queue to pick corresponding multiple different work items to execute, and to maintain the multiple different work items in the data parallel queue while the multiple different threads execute the corresponding multiple different work items, wherein the multiple different threads execute the corresponding multiple different work items by each performing a same set of instructions on the corresponding multiple different subsets of the set of data; said scheduling logic further comprising: logic to direct a thread to a task scheduling queue if the data parallel queue is empty; logic to perform a lock-free enqueue of a new node into the data parallel queue; and logic to perform a wait-free de-queue of a particular node from the data parallel queue in response to completion of all work items of the particular node and concurrently with the performance of the lock-free enqueue of the new node.
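The claimed structure — a singly-linked list of nodes, where each node holds a task whose work items cover different data subsets, nodes stay in the queue while workers execute their items, and the last worker to finish a node's items unlinks it — can be sketched as below. This is a minimal illustrative sketch, not the patented implementation: the names (`DataParallelQueue`, `pick`, `complete`) and the atomic-exchange enqueue are assumptions, and node reclamation and the interplay with a separate task scheduling queue are omitted.

```cpp
#include <atomic>
#include <cstddef>

// One node of the (hypothetical) data parallel queue: a shared task plus
// counters that hand out and retire its per-subset work items.
struct Node {
    int task_id = 0;                          // task shared by all work items
    std::size_t num_chunks = 0;               // number of data subsets
    std::atomic<std::size_t> next_chunk{0};   // next subset index to claim
    std::atomic<std::size_t> done_chunks{0};  // subsets completed so far
    std::atomic<Node*> next{nullptr};         // singly-linked list link
};

struct DataParallelQueue {
    std::atomic<Node*> head{nullptr};
    std::atomic<Node*> tail{nullptr};

    // Lock-free enqueue: atomically swing the tail to the new node, then
    // publish the link from the previous node. (Simplified: re-enqueue
    // after the queue has fully drained is not handled in this sketch.)
    void enqueue(Node* n) {
        Node* prev = tail.exchange(n, std::memory_order_acq_rel);
        if (prev)
            prev->next.store(n, std::memory_order_release);
        else
            head.store(n, std::memory_order_release);
    }

    // A worker picks the next unclaimed work item (data subset) from the
    // head node with a single fetch_add; the node remains in the queue
    // while its items are being executed, as the claim requires.
    bool pick(Node*& node, std::size_t& chunk) {
        Node* h = head.load(std::memory_order_acquire);
        if (!h) return false;  // empty: caller would fall back to the task scheduling queue
        std::size_t c = h->next_chunk.fetch_add(1, std::memory_order_acq_rel);
        if (c >= h->num_chunks) return false;  // all items of this node claimed
        node = h;
        chunk = c;
        return true;
    }

    // Wait-free retirement: each worker does one fetch_add when its item
    // finishes; the worker that completes the last item unlinks the node,
    // concurrently with any in-flight enqueue.
    void complete(Node* n) {
        std::size_t done =
            n->done_chunks.fetch_add(1, std::memory_order_acq_rel) + 1;
        if (done == n->num_chunks)
            head.store(n->next.load(std::memory_order_acquire),
                       std::memory_order_release);
    }
};
```

In a usage cycle, each worker thread loops on `pick`, runs the node's task over the returned subset index, then calls `complete`; because `pick` and `complete` are each a bounded number of atomic operations, one worker's progress never waits on another, matching the wait-free property described in the abstract.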
Address: Santa Clara, CA, US