Title of invention: ACCELERATING RECOVERY IN MPI ENVIRONMENTS
Abstract: A method, system, and computer usable program product for accelerating recovery in an MPI environment are provided in the illustrative embodiments. A first portion of a distributed application executes using a first processor, and a second portion executes using a second processor, in a distributed computing environment. After a failure of operation of the first portion, the first portion is restored to a checkpoint. A first part of the first portion is distributed to a third processor and a second part to a fourth processor. A computation of the first portion is performed using the first and the second parts in parallel. A first message is computed in the first portion and sent to the second portion, the message having been initially computed after a time of the checkpoint. A second message is replayed from the second portion without computing the second message in the second portion.
Publication number: US2011296241(A1)  Publication date: 2011.12.01
Application number: US20100788990  Filing date: 2010.05.27
Applicant: ELNOZAHY ELMOOTAZBELLAH NABIL; INTERNATIONAL BUSINESS MACHINES CORPORATION  Inventor: ELNOZAHY ELMOOTAZBELLAH NABIL
Classification: G06F11/14  Main classification: G06F11/14
Agency:  Agent:
Principal claim:
Address:
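
The recovery flow summarized in the abstract can be illustrated with a small single-machine simulation. This is only a sketch under stated assumptions, not the claimed implementation: the names `Checkpoint`, `Peer`, `partial_work`, `run_steps`, and `recover` are hypothetical, two threads stand in for the third and fourth processors, the surviving second portion is assumed to keep a log of the messages it has sent, and the application work is assumed to be a simple sum so that it can be split into independent parts. No MPI library is used.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    step: int    # step at which the state was saved
    total: int   # saved application state (a running sum in this toy model)


@dataclass
class Peer:
    """Surviving 'second portion'; assumed to log every message it sends."""
    sent_log: dict = field(default_factory=dict)   # step -> message payload
    received: list = field(default_factory=list)   # messages from the first portion
    compute_calls: int = 0

    def compute_message(self, step):
        # Normal execution: compute the message, log it, and return it.
        self.compute_calls += 1
        msg = step * 10
        self.sent_log[step] = msg
        return msg

    def replay_message(self, step):
        # Recovery: serve the message from the log without recomputing it.
        return self.sent_log[step]


def run_steps(peer, total, lo, hi):
    # Failure-free execution of the first portion over steps [lo, hi).
    for s in range(lo, hi):
        total += peer.compute_message(s) + s
    return total


def partial_work(args):
    # One part of the lost work, covering a sub-range of steps.
    inputs, lo, hi = args
    return sum(inputs[s] + s for s in range(lo, hi))


def recover(checkpoint, fail_step, peer):
    lo, hi = checkpoint.step, fail_step
    # Replay the peer's messages for the lost interval from its log.
    inputs = {s: peer.replay_message(s) for s in range(lo, hi)}
    # Split the lost interval into two parts, one per helper worker.
    mid = (lo + hi) // 2
    parts = [(inputs, lo, mid), (inputs, mid, hi)]
    # Run both parts concurrently (threads stand in for extra processors).
    with ThreadPoolExecutor(max_workers=2) as pool:
        partials = list(pool.map(partial_work, parts))
    total = checkpoint.total + sum(partials)
    # Recompute a message first produced after the checkpoint and send it on.
    peer.received.append(total)
    return total


peer = Peer()
ckpt = Checkpoint(step=4, total=run_steps(peer, 0, 0, 4))
expected = run_steps(peer, ckpt.total, 4, 8)   # progress lost when the failure hits
calls_before = peer.compute_calls
recovered = recover(ckpt, 8, peer)
print(recovered == expected, peer.compute_calls == calls_before)
```

In this toy model the recovered state matches the state lost at the failure, and the peer's compute counter does not change during recovery because its messages are served from the log. Real applications may have dependencies between steps that prevent such a clean split of the lost work.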