Title of invention: Distributing an executable job load file to compute nodes in a parallel computer
Abstract: Distributing an executable job load file to compute nodes in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: determining, by a compute node in the parallel computer, whether the compute node is participating in a job; determining, by the compute node in the parallel computer, whether a descendant compute node is participating in the job; responsive to determining that the compute node is participating in the job or that the descendant compute node is participating in the job, communicating, by the compute node to a parent compute node, an identification of a data communications link over which the compute node receives data from the parent compute node; constructing a class route for the job, wherein the class route identifies all compute nodes participating in the job; and broadcasting the executable load file for the job along the class route for the job.
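The participation check described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: it assumes the compute nodes form a rooted tree, and the names `ComputeNode`, `make_tree`, and `collect_class_route_links` are hypothetical. A link is identified here simply by the (parent, child) pair of node IDs, standing in for the "identification of a data communications link" that a node reports to its parent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class ComputeNode:
    """Hypothetical model of one compute node in a tree-shaped network."""
    node_id: int
    parent: Optional[int] = None              # None marks the root
    children: List[int] = field(default_factory=list)
    in_job: bool = False


def make_tree(parents: Dict[int, Optional[int]],
              job_nodes: Set[int]) -> Dict[int, ComputeNode]:
    """Build the node table from a child -> parent mapping."""
    nodes = {n: ComputeNode(n, p, in_job=(n in job_nodes))
             for n, p in parents.items()}
    for n, p in parents.items():
        if p is not None:
            nodes[p].children.append(n)
    return nodes


def collect_class_route_links(nodes: Dict[int, ComputeNode]) -> Set[Tuple[int, int]]:
    """Return the (parent, child) links that a class route would need.

    A node reports its incoming link to its parent when it participates
    in the job itself or when any of its descendants does.
    """
    links: Set[Tuple[int, int]] = set()

    def visit(node_id: int) -> bool:
        node = nodes[node_id]
        descendant_in_job = False
        for child in node.children:           # visit every child, no short-circuit
            if visit(child):
                descendant_in_job = True
        needed = node.in_job or descendant_in_job
        if needed and node.parent is not None:
            links.add((node.parent, node_id))  # report the link to the parent
        return needed

    root = next(n.node_id for n in nodes.values() if n.parent is None)
    visit(root)
    return links


# Example: nodes 3 and 5 run the job; node 4 is idle and is left off the route.
parents = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
nodes = make_tree(parents, job_nodes={3, 5})
print(sorted(collect_class_route_links(nodes)))
# prints [(0, 1), (0, 2), (1, 3), (2, 5)]
```

Nodes 1 and 2 do not run the job in this example, yet their links are kept because a descendant participates, which matches the "or that the descendant compute node is participating" condition in the abstract.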
Publication number: US9444908(B2) Publication date: 2016.09.13
Application number: US201414303208 Filing date: 2014.06.12
Applicant: International Business Machines Corporation Inventor: Gooding Thomas M.
Classification: G06F15/16; H04L29/08; G06F9/52; G06F9/50 Main classification: G06F15/16
Agency: Kennedy Lenart Spraggins LLP Agent: Lenart Edward J.; Cabrasawan Feb; Kennedy Lenart Spraggins LLP
Main claim: 1. A method of distributing an executable job load file to compute nodes in a parallel computer, the parallel computer comprising a plurality of compute nodes coupled for data communications over a data communications network, the method comprising: iteratively for a predetermined number of iterations: determining, by a compute node in the parallel computer, whether the compute node is participating in a job; determining, by the compute node in the parallel computer, whether a descendant compute node is participating in the job; responsive to determining that the compute node is participating in the job or that the descendant compute node is participating in the job, communicating, by the compute node to a parent compute node, an identification of a data communications link over which the compute node receives data from the parent compute node; constructing a class route for the job, wherein the class route identifies all compute nodes participating in the job and all data communications links between each of the compute nodes participating in the job, wherein each compute node in the parallel computer includes a routing table that associates a class route identifier with one or more egress ports on the compute node for forwarding a message received by the compute node that includes the class route identifier; and broadcasting the executable load file for the job along the class route for the job, wherein the executable load file is included in a message that also includes a class route identifier for the job, including identifying, by each compute node participating in the job, from the routing table in the compute node, an egress port within the compute node to utilize when forwarding the message.
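The routing-table and broadcast steps of claim 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the network is assumed to be a rooted tree, an egress port is represented by the ID of the child node it leads to, and the "predetermined number of iterations" is assumed to be at least the depth of the tree so that each report can travel one hop toward the root per iteration. The names `build_routing_tables` and `broadcast` are hypothetical.

```python
from typing import Dict, List, Optional, Set

RoutingTable = Dict[int, Set[int]]   # class route ID -> egress ports (child IDs)


def build_routing_tables(parent: Dict[int, Optional[int]],
                         job_nodes: Set[int],
                         class_route_id: int,
                         iterations: int) -> Dict[int, RoutingTable]:
    """Fill each node's routing table for one class route.

    In every iteration, each node that participates (or already knows of a
    participating descendant) reports its incoming link to its parent.
    """
    needed = {n: (n in job_nodes) for n in parent}
    tables: Dict[int, RoutingTable] = {n: {class_route_id: set()} for n in parent}
    for _ in range(iterations):
        # Snapshot the reports first so information moves one hop per iteration.
        reports = [(n, p) for n, p in parent.items()
                   if p is not None and needed[n]]
        for child, par in reports:
            tables[par][class_route_id].add(child)   # egress port toward child
            needed[par] = True
    return tables


def broadcast(tables: Dict[int, RoutingTable], root: int,
              message: dict) -> List[int]:
    """Forward a message along its class route; return nodes in arrival order."""
    route_id = message["class_route_id"]
    received: List[int] = []
    frontier = [root]
    while frontier:
        node = frontier.pop(0)
        received.append(node)
        # Each node looks up the egress ports for this class route ID.
        frontier.extend(sorted(tables[node].get(route_id, ())))
    return received


# Example: nodes 3 and 5 run the job; the tree has depth 2.
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
tables = build_routing_tables(parent, job_nodes={3, 5},
                              class_route_id=7, iterations=2)
message = {"class_route_id": 7, "payload": b"executable load file"}
print(broadcast(tables, root=0, message=message))
# prints [0, 1, 2, 3, 5]
```

Node 4 never receives the message because no routing table lists a port toward it for class route 7. With too few iterations (for example `iterations=1` here), the reports from nodes 3 and 5 would not yet have reached the root, which is one plausible reading of why the claim repeats the determination steps a predetermined number of times.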
Address: Armonk, NY, US