Title of invention: FAILURE RECOVERY FOR TRANSPLANTING ALGORITHMS FROM CLUSTER TO CLOUD
Abstract: A method (400) provides failure recovery capabilities to a cloud environment (10) for scientific HPC applications. An HPC application implemented with MPI extends the class of MPI programs so that various degrees of fault tolerance are embedded in the application. An MPI fault tolerance mechanism realizes a recover-and-continue solution: if an error occurs, only the failed processes are re-spawned, while the surviving processes remain on their original processors/nodes (12, 14, 16), which minimizes system recovery costs.
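The recover-and-continue idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the claimed method and not MPI code: the `Proc` record, the `recover_and_continue` function, the `generation` counter, the node names, and the choice to place a re-spawned process on a caller-supplied spare node are all hypothetical and do not come from the record.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    """Hypothetical stand-in for one MPI process (not a real MPI object)."""
    rank: int
    node: str
    alive: bool = True
    generation: int = 0  # number of times this rank has been re-spawned


def recover_and_continue(procs, spare_node):
    """Re-spawn only the failed ranks; surviving ranks are left untouched.

    Assumption: a failed rank is restarted on `spare_node`. The record does
    not say where re-spawned processes are placed.
    """
    respawned = []
    for p in procs:
        if not p.alive:
            p.node = spare_node
            p.alive = True
            p.generation += 1
            respawned.append(p.rank)
    return respawned


# Three ranks on three illustrative nodes; rank 1 fails.
procs = [Proc(0, "node-a"), Proc(1, "node-b"), Proc(2, "node-c")]
procs[1].alive = False
respawned = recover_and_continue(procs, spare_node="node-a")
```

After the call, only rank 1 has been restarted; ranks 0 and 2 keep their original node and state, which is the property the abstract credits with keeping recovery costs low.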
Publication number: WO2015081318(A1) Publication date: 2015.06.04
Application number: WO2014US67815 Filing date: 2014.11.28
Applicant: FUTUREWEI TECHNOLOGIES, INC.; HUAWEI TECHNOLOGIES CO., LTD. Inventors: REN, DA QI; WEI, ZHULIN
Classification: G06F11/07 Main classification: G06F11/07
Agency: Agent:
Main claim:
Address: