Title: Circumventing Load Imbalance in Parallel Simulations Caused by Faulty Hardware Nodes
Abstract: The present disclosure describes methods, systems, and computer program products for circumventing parallel processing load imbalance. One computer-implemented method includes generating a library function for a plurality of parallel-processing nodes, receiving timing statistics from each of the plurality of parallel-processing nodes, the timing statistics generated by executing the library function on each parallel-processing node, determining that a faulty parallel-processing node exists, signaling a simulator to checkpoint and stop a simulation executing on the parallel processing nodes, and removing the faulty parallel-processing node from parallel processing nodes available to execute the simulation.
Publication number: US2015227442(A1) Publication date: 2015.08.13
Application number: US201414178108 Filing date: 2014.02.11
Applicant: Saudi Arabian Oil Company Inventors: Baddourah Majdi A.; Hayder M. Ehtesham
Classification: G06F11/20 Main classification: G06F11/20
Agency: Agent:
Main claim: 1. A computer-implemented method comprising: generating a library function for a plurality of parallel-processing nodes; receiving timing statistics from each of the plurality of parallel-processing nodes, the timing statistics generated by executing the library function on each parallel-processing node; determining that a faulty parallel-processing node exists; signaling a simulator to checkpoint and stop a simulation executing on the parallel processing nodes; and removing the faulty parallel-processing node from parallel processing nodes available to execute the simulation.
Address: Dhahran SA
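
The steps recited in the main claim can be illustrated with a minimal Python sketch. This is not the patented implementation. The record does not say how a faulty node is determined, so the median-based threshold (`slowdown_factor`) is an assumption made for illustration, and the names `library_function`, `find_faulty_nodes`, `circumvent_imbalance`, and `signal_checkpoint_and_stop` are hypothetical. In a real cluster each node would run the timing kernel itself and report its result; here the timings are passed in as a plain dictionary.

```python
import statistics
import time


def library_function(work_units=200_000):
    """Hypothetical fixed-size timing kernel; returns elapsed seconds.

    Each node would execute the same kernel so that its timings are comparable.
    """
    start = time.perf_counter()
    total = 0.0
    for i in range(work_units):
        total += i * 0.5
    return time.perf_counter() - start


def find_faulty_nodes(timings, slowdown_factor=1.5):
    """Return the nodes whose time exceeds slowdown_factor times the median.

    The threshold rule is an assumption, not taken from the patent record.
    """
    median = statistics.median(timings.values())
    return sorted(n for n, t in timings.items() if t > slowdown_factor * median)


def circumvent_imbalance(available_nodes, timings, signal_checkpoint_and_stop):
    """Detect slow nodes, signal the simulator, and shrink the node list."""
    faulty = find_faulty_nodes(timings)
    if faulty:
        # Ask the simulator to checkpoint and stop before changing the node set.
        signal_checkpoint_and_stop()
        available_nodes = [n for n in available_nodes if n not in faulty]
    return available_nodes, faulty


# Example with made-up timing statistics (seconds) from four nodes.
timings = {"node0": 1.02, "node1": 0.98, "node2": 2.75, "node3": 1.01}
events = []
remaining, faulty = circumvent_imbalance(
    list(timings), timings, lambda: events.append("checkpoint_and_stop")
)
```

With these illustrative numbers the median is 1.015 s, so only `node2` crosses the 1.5x threshold; the simulator is signaled once and the simulation could then be restarted from the checkpoint on the three remaining nodes.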