Title: Scheduling synchronization in association with collective operations in a parallel computer
Abstract: Methods, apparatuses, and computer program products for scheduling synchronization in association with collective operations in a parallel computer that includes a shared memory and a plurality of compute nodes that execute a parallel application utilizing the shared memory are provided. Embodiments include acquiring an available channel of the shared memory; posting to the acquired channel of the shared memory one or more collective operations and a synchronization point; determining that processing within the acquired channel has reached the synchronization point; and posting to the acquired channel, in response to determining that processing within the acquired channel has reached the synchronization point, a background synchronization operation corresponding to the one or more collective operations.
Publication number: US8869168(B2) Publication date: 2014.10.21
Application number: US201213470932 Filing date: 2012.05.14
Applicant: International Business Machines Corporation Inventors: Archer Charles J.;Blocksome Michael A.;Ratterman Joseph D.;Smith Brian E.
Classification: G06F9/44;G06F15/167 Main classification: G06F9/44
Agency: Biggers Kennedy Lenart Spraggins LLP Agent: Biggers Kennedy Lenart Spraggins LLP
Main claim: 1. A method of scheduling synchronization in association with collective operations in a parallel computer, the parallel computer comprising a shared memory and a plurality of compute nodes that execute a parallel application utilizing the shared memory, the method comprising: determining whether any channels of the shared memory are available; if a channel of the shared memory is not available, monitoring the shared memory for an indication that a channel of the shared memory is available and storing within a queue associated with the shared memory, one or more collective operations; if a particular channel of the shared memory is available, selecting the available channel for acquisition; acquiring, by the parallel application, the available channel of the shared memory; posting to the acquired channel of the shared memory, by the parallel application, the one or more collective operations and a synchronization point, including transferring the one or more collective operations from the queue to the acquired channel of the shared memory; determining, by the parallel application, that processing within the acquired channel has reached the synchronization point; and in response to determining that processing within the acquired channel has reached the synchronization point, posting to the acquired channel, by the parallel application, a background synchronization operation corresponding to the one or more collective operations.
Address: Armonk, NY, US
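
The sequence of steps recited in the main claim can be illustrated with a small single-process sketch. This is only an illustration under stated assumptions, not the patented implementation: the names `Channel`, `SharedMemory`, `schedule`, `SYNC_POINT`, and the `"background_sync"` tag are all hypothetical, the shared memory is modelled with ordinary Python objects, and "monitoring" for a free channel is reduced to the caller retrying later. As a simplification, operations are always placed in the queue first and then transferred to the channel.

```python
from collections import deque

SYNC_POINT = "SYNC_POINT"  # hypothetical marker for the synchronization point


class Channel:
    """One channel of the shared memory (in-process stand-in, an assumption)."""

    def __init__(self, channel_id):
        self.channel_id = channel_id
        self.owner = None        # None means the channel is available
        self.posted = deque()    # entries posted but not yet processed
        self.processed = []      # entries in the order they were processed

    def process_next(self):
        """Process the oldest posted entry and return it."""
        entry = self.posted.popleft()
        self.processed.append(entry)
        return entry


class SharedMemory:
    def __init__(self, num_channels):
        self.channels = [Channel(i) for i in range(num_channels)]
        self.queue = deque()     # queue associated with the shared memory

    def find_available_channel(self):
        """Return the first unowned channel, or None if all are taken."""
        return next((c for c in self.channels if c.owner is None), None)


def schedule(shm, app, collectives):
    """Return the acquired channel, or None if the operations were only queued."""
    # Store the collective operations in the queue (simplification: always).
    shm.queue.extend(collectives)

    # Determine whether any channel is available.
    channel = shm.find_available_channel()
    if channel is None:
        return None              # caller keeps monitoring and calls again later

    # Select and acquire the available channel.
    channel.owner = app

    # Post the collectives by transferring them from the queue to the channel,
    # followed by the synchronization point.
    transferred = []
    while shm.queue:
        op = shm.queue.popleft()
        channel.posted.append(op)
        transferred.append(op)
    channel.posted.append(SYNC_POINT)

    # Determine that processing within the channel has reached the sync point.
    while channel.process_next() != SYNC_POINT:
        pass

    # In response, post the background synchronization operation that
    # corresponds to the transferred collective operations.
    channel.posted.append(("background_sync", tuple(transferred)))
    return channel


if __name__ == "__main__":
    shm = SharedMemory(num_channels=1)
    ch = schedule(shm, "app", ["broadcast", "allreduce"])
    print(ch.processed)          # ['broadcast', 'allreduce', 'SYNC_POINT']
    print(list(ch.posted))       # [('background_sync', ('broadcast', 'allreduce'))]
    print(schedule(shm, "app", ["gather"]))  # None: channel busy, op stays queued
```

In this sketch the background synchronization operation is posted only after the synchronization point has been processed, which mirrors the ordering in the claim; a real parallel computer would process the channel asynchronously across compute nodes rather than in a local loop.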