Title: Constructing a logical tree topology in a parallel computer
Abstract: Constructing a logical tree topology in a parallel computer that includes compute nodes, where each node executes a number of tasks and at least one node executes a number of tasks different from another node includes: identifying a compute node executing a greatest number of tasks; selecting, as a global root, a task from the identified compute node, including assigning the task as a local root of the identified compute node and assigning each of the other tasks of the identified compute node as a child of the local root; selecting, from each of the other compute nodes, one task to be a local root, including assigning each task other than the local root as a child of the local root; and assigning each local root of the other compute nodes to be a child of one of the tasks of the identified compute node other than the global root.
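The construction steps in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_logical_tree`, the dictionary layout, the choice of the first listed task as each root, and the round-robin assignment of local roots to parents are all assumptions made for the example. The abstract only requires that each local root become a child of some task of the identified node other than the global root.

```python
def build_logical_tree(tasks_by_node):
    """Return (global_root, parent), where parent maps each task to its parent.

    tasks_by_node: dict mapping a compute-node id to the list of task ids it runs
    (hypothetical input format chosen for this sketch).
    """
    # Identify the compute node executing the greatest number of tasks.
    identified = max(tasks_by_node, key=lambda n: len(tasks_by_node[n]))

    # Assumption: the first listed task becomes the global root (and local root).
    global_root, *siblings = tasks_by_node[identified]
    if not siblings:
        raise ValueError("identified node needs at least one task besides the global root")

    parent = {global_root: None}
    # Every other task of the identified node is a child of the global root.
    for task in siblings:
        parent[task] = global_root

    others = [n for n in tasks_by_node if n != identified]
    for i, node in enumerate(others):
        # Assumption: the first listed task becomes this node's local root.
        local_root, *children = tasks_by_node[node]
        for task in children:
            parent[task] = local_root
        # Assumption: round-robin over the non-root tasks of the identified node.
        parent[local_root] = siblings[i % len(siblings)]

    return global_root, parent
```

For example, with node A running tasks 0 to 3, node B running tasks 4 and 5, and node C running task 6, task 0 becomes the global root, task 4 (local root of B) becomes a child of task 1, and task 6 (local root of C) becomes a child of task 2.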
Publication number: US9348651(B2)  Publication date: 2016.05.24
Application number: US201314098162  Filing date: 2013.12.05
Applicant: International Business Machines Corporation  Inventors: Archer Charles J.;K. A. Nysal Jan;Sharkawi Sameh S.
Classification: G06F9/46;G06F9/50;H04L12/751;H04L29/08  Main classification: G06F9/46
Agency: Kennedy Lenart Spraggins LLP  Agents: Lenart Edward J.;Cabrasawan Feb;Kennedy Lenart Spraggins LLP
Independent claim: 1. An apparatus for constructing a logical tree topology in a parallel computer, the parallel computer comprising a plurality of compute nodes, each compute node executing a number of tasks, wherein at least one compute node executes a number of tasks that is different than the number of tasks executed by another one of the compute nodes, the apparatus comprising a computer processor, a computer memory operatively coupled to the computer processor, the computer memory having disposed within it computer program instructions that, when executed by the computer processor, cause the apparatus to carry out:
identifying a compute node from the plurality of compute nodes executing a greatest number of tasks;
selecting, as a global root in a logical tree topology, a task from the identified compute node, including assigning the task as a local root of the identified compute node and assigning each of the other tasks of the identified compute node as a child of the local root;
selecting, from each of the other compute nodes, one task to be a local root in the logical tree topology, including assigning each task other than the local root as a child of the local root; and
assigning each local root of the other compute nodes to be a child of one of the tasks of the identified compute node other than the global root;
performing a reduce operation in the logical tree topology including:
transmitting, by each task of a compute node other than the identified compute node, reduce data to the local root of the compute node through shared memory, including performing a reduction operation on the reduce data;
transmitting, in parallel by each local root of a compute node other than the identified compute node, the reduce data to the task of the identified compute node assigned to be a parent of the local root, including performing the reduction operation on the reduce data; and
transmitting, through shared memory by each of the tasks of the identified compute node, the reduce data to the global root including performing the reduction operation on the reduce data.
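The three transmission steps of the reduce operation in the claim can be traced with a small sequential simulation. This is a sketch under stated assumptions: `tree_reduce`, its input format, and the round-robin parent assignment are hypothetical; the code models neither shared memory nor the parallel network transfers, only the order in which partial results are combined.

```python
from functools import reduce as fold
import operator


def tree_reduce(tasks_by_node, values, op=operator.add):
    """Simulate the claimed reduce and return the value held by the global root.

    tasks_by_node: dict of compute-node id -> list of task ids (hypothetical format).
    values: dict of task id -> that task's contribution to the reduction.
    op: associative binary reduction operation (addition by default).
    """
    identified = max(tasks_by_node, key=lambda n: len(tasks_by_node[n]))
    # Assumption: the first listed task is the global root; the rest act as parents.
    global_root, *parents = tasks_by_node[identified]
    if not parents:
        raise ValueError("identified node needs at least one task besides the global root")

    # Partial result currently held by each task of the identified node.
    partial = {t: values[t] for t in tasks_by_node[identified]}

    others = [n for n in tasks_by_node if n != identified]
    for i, node in enumerate(others):
        # Step 1: tasks of the node reduce into the local root
        # (through shared memory in the claim; a plain fold here).
        local_result = fold(op, (values[t] for t in tasks_by_node[node]))
        # Step 2: the local root sends its result to its assigned parent on the
        # identified node (in parallel in the claim; round-robin is an assumption).
        p = parents[i % len(parents)]
        partial[p] = op(partial[p], local_result)

    # Step 3: tasks of the identified node reduce into the global root.
    return fold(op, (partial[t] for t in tasks_by_node[identified]))
```

With nodes A = [0, 1, 2, 3], B = [4, 5], C = [6] and each task t contributing t + 1, `tree_reduce` returns 28 for addition and 7 for `max`. Spreading the incoming local roots across several tasks of the largest node, rather than sending all of them to the global root, is what lets step 2 proceed over multiple receivers at once.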
Address: Armonk NY US