Title of Invention Link Failure Detection in a Parallel Computer
Abstract Methods, apparatus, and products are disclosed for link failure detection in a parallel computer including compute nodes connected in a rectangular mesh network, each pair of adjacent compute nodes in the rectangular mesh network connected together using a pair of links, that includes: assigning each compute node to either a first group or a second group such that adjacent compute nodes in the rectangular mesh network are assigned to different groups; sending, by each of the compute nodes assigned to the first group, a first test message to each adjacent compute node assigned to the second group; determining, by each of the compute nodes assigned to the second group, whether the first test message was received from each adjacent compute node assigned to the first group; and notifying a user, by each of the compute nodes assigned to the second group, whether the first test message was received.
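The grouping and test-message steps in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the coordinate-parity rule for assigning groups, the modelling of each link pair as two directed links, and the names `group_of`, `neighbors`, and `detect_failed_links` are all hypothetical choices made for illustration.

```python
from itertools import product


def group_of(coord):
    """Assumed grouping rule: parity of the coordinate sum (checkerboard).

    Adjacent nodes in a rectangular mesh differ by 1 in exactly one
    coordinate, so they always land in different groups.
    """
    return sum(coord) % 2


def neighbors(coord, dims):
    """Yield the mesh neighbors of coord (no wraparound links)."""
    for axis in range(len(dims)):
        for step in (-1, 1):
            n = list(coord)
            n[axis] += step
            if 0 <= n[axis] < dims[axis]:
                yield tuple(n)


def detect_failed_links(dims, broken_links):
    """Simulate one round of first-group -> second-group test messages.

    broken_links is a set of (sender, receiver) pairs; treating each
    link of a pair as directed is an assumption of this sketch.
    Returns {second_group_node: [first_group_neighbors not heard from]}.
    """
    nodes = list(product(*(range(d) for d in dims)))

    # Step 1: every first-group node sends a test message to each neighbor.
    received = set()
    for node in nodes:
        if group_of(node) == 0:
            for nb in neighbors(node, dims):
                if (node, nb) not in broken_links:
                    received.add((node, nb))

    # Step 2: every second-group node checks which messages arrived.
    report = {}
    for node in nodes:
        if group_of(node) == 1:
            missing = [nb for nb in neighbors(node, dims)
                       if (nb, node) not in received]
            if missing:
                report[node] = missing  # this is what would be reported to the user
    return report


if __name__ == "__main__":
    # 2 x 3 mesh with one broken link from (0, 0) to (0, 1).
    print(detect_failed_links((2, 3), {((0, 0), (0, 1))}))
```

Only the first-group to second-group direction is exercised here, matching the steps listed in the abstract; testing the reverse direction would need a second round with the roles swapped.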
Publication No. US2009037773(A1) Publication Date 2009.02.05
Application No. US20070832940 Filing Date 2007.08.02
Applicant ARCHER CHARLES J;BLOCKSOME MICHAEL A;MEGERIAN MARK G;SMITH BRIAN E Inventor ARCHER CHARLES J.;BLOCKSOME MICHAEL A.;MEGERIAN MARK G.;SMITH BRIAN E.
Classification G06F11/00 Main Classification G06F11/00
Agency Agent
Independent Claim
Address