Abstract |
Metric-based verification tool for large-scale models, e.g. weather simulations, ported between High Performance Computing (HPC) systems, comprising: data module 20 generating a host ensemble and a target ensemble of solutions by executing a numerical model program while repeatedly varying execution parameters in the host and target environments, respectively; analytics engine 30 computing a mathematical distance between the host and target ensembles; and decision module 40 using the distance to decide whether porting has introduced an error. The host ensemble is a perturbed architecture ensemble (PAE) obtained by varying software/hardware parameters such as task/data parallelism, programming/runtime environment settings, operating system release, and middleware version. The target ensemble is an initial condition ensemble (ICE) obtained by varying the initial conditions of the model. Varying the execution parameters randomly spreads the solutions to form a distributed PAE and ICE, and the Bhattacharyya distance between them is used for decision making. The presence of systematic porting errors affecting the solutions is thereby identified. |
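
The distance-and-decision step described above can be illustrated with a minimal sketch. It assumes each ensemble has been reduced to one scalar diagnostic per run and that both ensembles are approximately Gaussian, so the closed-form Bhattacharyya distance for two univariate normal distributions applies. The function names, the scalar reduction, and the `threshold` value are hypothetical choices made for illustration; they are not taken from the abstract.

```python
import math
import random
import statistics


def bhattacharyya_gaussian(host, target):
    """Bhattacharyya distance between two samples, each modelled as a
    univariate Gaussian (an assumption made for this sketch)."""
    mu_h, mu_t = statistics.fmean(host), statistics.fmean(target)
    var_h, var_t = statistics.variance(host), statistics.variance(target)
    # Term driven by the separation of the ensemble means.
    mean_term = 0.25 * (mu_h - mu_t) ** 2 / (var_h + var_t)
    # Term driven by the mismatch of the ensemble spreads.
    spread_term = 0.5 * math.log((var_h + var_t) / (2.0 * math.sqrt(var_h * var_t)))
    return mean_term + spread_term


def porting_error_suspected(host, target, threshold=0.1):
    """Hypothetical decision rule: flag a possible systematic porting
    error when the distance exceeds an illustrative threshold."""
    return bhattacharyya_gaussian(host, target) > threshold


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for one scalar diagnostic per ensemble member.
    host_ensemble = [rng.gauss(288.0, 0.5) for _ in range(50)]
    target_ensemble = [rng.gauss(288.0, 0.5) for _ in range(50)]
    print(bhattacharyya_gaussian(host_ensemble, target_ensemble))
    print(porting_error_suspected(host_ensemble, target_ensemble))
```

In practice the threshold would have to be calibrated against the natural spread of the ensembles, and a multivariate or histogram-based form of the distance may be more appropriate for full model fields.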