Title of invention: Collective operation protocol selection in a parallel computer
Abstract: Collective operation protocol selection in a parallel computer that includes compute nodes may be carried out by calling a collective operation with operating parameters; selecting a protocol for executing the operation; and executing the operation with the selected protocol. Selecting a protocol includes: iteratively, until a prospective protocol meets predetermined performance criteria: providing, to a protocol performance function for the prospective protocol, the operating parameters; determining whether the prospective protocol meets predefined performance criteria by evaluating a predefined performance fit equation, calculating a measure of performance of the protocol for the operating parameters; determining that the prospective protocol meets predetermined performance criteria and selecting the protocol for executing the operation only if the calculated measure of performance is greater than a predefined minimum performance threshold.
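The abstract does not say what form a "performance fit equation" takes. As a minimal sketch only, assume the fit is a latency plus per-byte cost model whose output is an effective bandwidth, so that a larger value means better performance and can be compared against a minimum threshold. The function names, the `message_bytes` operating parameter, and the coefficients are all hypothetical and do not come from the patent.

```python
def fit_bandwidth(message_bytes: float, latency_s: float, seconds_per_byte: float) -> float:
    """Hypothetical fit equation: predicted effective bandwidth in bytes/second.

    Assumes predicted time = latency_s + seconds_per_byte * message_bytes.
    """
    predicted_time = latency_s + seconds_per_byte * message_bytes
    return message_bytes / predicted_time


def meets_criteria(message_bytes: float, latency_s: float,
                   seconds_per_byte: float, min_bandwidth: float) -> bool:
    """Return True only if the calculated measure exceeds the minimum threshold."""
    measure = fit_bandwidth(message_bytes, latency_s, seconds_per_byte)
    return measure > min_bandwidth
```

In this sketch the coefficients would be fitted offline per protocol; the comparison is strict, matching the "greater than" wording of the abstract.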
Publication No.: US8893083(B2)  Publication date: 2014.11.18
Application No.: US201113206116  Filing date: 2011.08.09
Applicant: International Business Machines Corporation  Inventors: Archer Charles J.; Blocksome Michael A.; Ratterman Joseph D.; Smith Brian E.
Classification: G06F9/44; G06F9/45; G06F9/38; G06F15/78; G06F9/50; G06F11/34  Main classification: G06F9/44
Agency: Biggers Kennedy Lenart Spraggins LLP  Agent: Biggers Kennedy Lenart Spraggins LLP
Principal claim: 1. An apparatus for collective operation protocol selection in a parallel computer, the parallel computer comprising a plurality of compute nodes, the apparatus comprising a computer processor, a computer memory operatively coupled to the computer processor, the computer memory having disposed within it computer program instructions that, when executed by the computer processor, cause the apparatus to carry out the steps of: calling a collective operation with one or more operating parameters; selecting one of a plurality of protocols that define execution of the collective operation, including, iteratively, for each protocol beginning with a first prospective protocol until a prospective protocol meets predetermined performance criteria: providing, to a protocol performance function for the prospective protocol, the operating parameters of the collective operation; determining, by the performance function, whether the prospective protocol meets predefined performance criteria for the operating parameters, including evaluating, with the operating parameters, a predefined performance fit equation for the prospective protocol, calculating a measure of performance of the prospective protocol for the operating parameters, and determining that the prospective protocol meets predetermined performance criteria; and selecting the prospective protocol as the protocol for executing the collective operation only if the calculated measure of performance is greater than a predefined minimum performance threshold; and executing the collective operation with the selected protocol.
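The steps recited in the claim can be sketched as a short loop. This is an illustrative reading, not the patented implementation: the `Protocol` class, its field names, and the dictionary of operating parameters are hypothetical, and raising an error when no protocol qualifies is an assumption, since the claim does not describe that case.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Params = Dict[str, float]  # hypothetical representation of the operating parameters


@dataclass
class Protocol:
    name: str
    fit: Callable[[Params], float]  # hypothetical performance fit equation
    min_performance: float          # predefined minimum performance threshold
    run: Callable[[Params], str]    # stand-in for executing the collective operation

    def performance_function(self, params: Params) -> bool:
        """Evaluate the fit equation and compare the measure with the threshold."""
        measure = self.fit(params)
        return measure > self.min_performance


def select_protocol(protocols: List[Protocol], params: Params) -> Optional[Protocol]:
    """Try each prospective protocol in order; stop at the first that qualifies."""
    for prospective in protocols:
        if prospective.performance_function(params):
            return prospective
    return None


def call_collective(protocols: List[Protocol], params: Params) -> str:
    """Select a protocol for the given parameters, then execute with it."""
    selected = select_protocol(protocols, params)
    if selected is None:
        # Assumption: the claim does not say what happens when nothing qualifies.
        raise RuntimeError("no protocol met its performance criteria")
    return selected.run(params)
```

Because the loop stops at the first qualifying protocol, the order of the list matters: under this reading, the first acceptable protocol is chosen rather than the best-scoring one.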
Address: Armonk NY US