Title of invention: System for reducing data transfer latency to a global queue by generating bit mask to identify selected processing nodes/units in multi-node data processing system
Abstract: A system for efficient dispatch/completion of a work element within a multi-node data processing system. The system comprises a processor performing the functions of: selecting specific processing units from among the processing nodes to complete execution of a work element that has multiple individual work items that may be independently executed by different ones of the processing units; generating an allocated processor unit (APU) bit mask that identifies at least one of the processing units that has been selected; placing the work element in a first entry of a global command queue (GCQ); associating the APU mask with the work element in the GCQ; and responsive to receipt at the GCQ of work requests from each of the multiple processing nodes or the processing units, enabling only the selected specific ones of the processing nodes or the processing units to be able to retrieve work from the work element in the GCQ.
Publication number: US8819690(B2) Publication date: 2014.08.26
Application number: US200912649667 Filing date: 2009.12.30
Applicant: International Business Machines Corporation Inventors: Alexander Benjamin G.; Bellows Gregory H.; Madruga Joaquin; Minor Barry L.
Classification: G06F9/46 Main classification: G06F9/46
Agency: Yudell Isidore Ng Russell PLLC Agent: Yudell Isidore Ng Russell PLLC
Principal claim: 1. A data processing system comprising: a plurality of processing nodes, each processing node having at least one processing unit; a system memory construct coupled to the plurality of processing nodes via an interconnect; a global command queue (GCQ) maintained within the system memory construct during data processing operations utilizing multiple processing resources; and a scheduler performing the functions of: selectively allocating a work element having multiple individual work items to specific processing nodes or processing units from among the plurality of processing nodes in order to complete execution of the work element, wherein the multiple individual work items may be independently executed by different ones of the plurality of processing nodes and by different ones of the processing units; generating an allocated processor unit (APU) bit mask that identifies at least one of the specific processing nodes or processing units that has been selectively allocated the work element, wherein the APU bit mask further comprises a bit for each processing node and processing unit from among the plurality of processing nodes; in response to generating the APU bit mask: placing the work element in a first entry of the GCQ; and associating the APU bit mask with the work element in the first entry of the GCQ; and in response to the GCQ receiving work requests from each of the plurality of processing nodes or the processing units, dispatching the multiple individual work items from the work element in the GCQ to only the selected specific processing nodes or processing units.
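Illustrative sketch (not part of the patent text): the claim's flow of generating an APU bit mask, associating it with a work element in the GCQ, and dispatching work items only to allocated units could look roughly like the following. The names `UNITS_PER_NODE`, `unit_bit`, `make_apu_mask`, `WorkElement`, and `GlobalCommandQueue`, the node-major bit layout, and the in-process list standing in for a queue in shared system memory are all assumptions made for illustration; the record does not specify any of them.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

UNITS_PER_NODE = 4  # assumption: every node has the same number of processing units


def unit_bit(node: int, unit: int) -> int:
    """Return the mask bit for one processing unit (assumed node-major layout)."""
    return 1 << (node * UNITS_PER_NODE + unit)


def make_apu_mask(selected: Iterable[Tuple[int, int]]) -> int:
    """Build an APU bit mask with one bit set per selected (node, unit) pair."""
    mask = 0
    for node, unit in selected:
        mask |= unit_bit(node, unit)
    return mask


@dataclass
class WorkElement:
    items: List[str]  # independently executable work items
    apu_mask: int     # which units may retrieve work from this element


class GlobalCommandQueue:
    """Toy stand-in for a GCQ; a real one would live in shared system memory."""

    def __init__(self) -> None:
        self.entries: List[WorkElement] = []

    def enqueue(self, items: Iterable[str], apu_mask: int) -> None:
        """Place a work element in the queue and associate the mask with it."""
        self.entries.append(WorkElement(list(items), apu_mask))

    def request_work(self, node: int, unit: int) -> Optional[str]:
        """Hand out the next work item only if the requester's bit is set."""
        bit = unit_bit(node, unit)
        for entry in self.entries:
            if entry.apu_mask & bit and entry.items:
                return entry.items.pop(0)
        return None  # requester was not allocated any pending work


gcq = GlobalCommandQueue()
mask = make_apu_mask([(0, 1), (2, 3)])   # allocate node 0 unit 1 and node 2 unit 3
gcq.enqueue(["item-a", "item-b", "item-c"], mask)

denied = gcq.request_work(0, 0)    # not in the mask
granted = gcq.request_work(0, 1)   # in the mask
```

In this sketch every unit may send a work request, but only a requester whose bit is set in the element's mask receives a work item; a single bitwise AND decides each request.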
Address: Armonk NY US