Title of invention Congestion detection in a network interconnect
Abstract A method and system for detecting congestion in a network of nodes, abating the network congestion, and identifying the cause of the network congestion is provided. A congestion detection system may comprise a detection system, an abatement system, and a causation system. The detection system monitors the performance of network components such as the network interface controllers and tiles of routers to determine whether the network is congested such that a delay in delivering packets becomes unacceptable. Upon detecting that the network is congested, an abatement system abates the congestion by limiting the rate at which packets are injected into the network from the nodes. Upon detecting that the network is congested, a causation system may identify the job that is executing on a node that is the cause of the network congestion.
Publication number US9391899(B2) Publication date 2016.07.12
Application number US201414570722 Filing date 2014.12.15
Applicant Cray Inc. Inventors Kaplan Laurence S.;Froese Edwin Lloyd;Johns Christopher Brian;Kelly Matthew Paul;Godfrey Aaron Forest;Shields Brent Thomas
Classification H04L12/801;G06F15/173;H04L12/26;H04L29/08;H04L12/24 Main classification H04L12/801
Agency Perkins Coie LLP Agent Perkins Coie LLP
Principal claim 1. A computer-readable storage medium that is not a transitory propagating signal storing computer-executable instructions for controlling a computer system to detect congestion in a network of nodes connected via connection devices, the computer-executable instructions comprising instructions of: a component that collects performance measurements for the connection devices indicating whether data is delayed at the connection devices, the performance measurements indicating number of connection device periods in which data was forwarded by a connection device during a measurement period and number of connection device periods during the measurement period in which data was delayed at the connection device; a component that determines based on the performance measurements of the connection devices whether a connection device satisfies a stall criterion; a component that determines based on the connection devices that satisfy the stall criterion whether the network satisfies a network congestion criterion; and a component that indicates the network is congested when the network stall criterion is satisfied.
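Illustrative sketch (not part of the patent text): the claim names three steps, namely collecting per-device counts of forwarded and delayed periods, testing each device against a stall criterion, and testing the network against a congestion criterion. The Python below is one possible reading of those steps. The claim does not define either criterion, so the ratio tests, both threshold values, and all names (DeviceSample, is_stalled, network_congested) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DeviceSample:
    """Counts collected for one connection device over one measurement period."""
    device_id: str
    forwarded_periods: int  # periods in which data was forwarded
    stalled_periods: int    # periods in which data was delayed


# Hypothetical thresholds; the claim does not specify any values.
STALL_RATIO_THRESHOLD = 0.5
CONGESTED_FRACTION_THRESHOLD = 0.25


def is_stalled(sample: DeviceSample,
               threshold: float = STALL_RATIO_THRESHOLD) -> bool:
    """Assumed stall criterion: stalled periods exceed a share of active periods."""
    active = sample.forwarded_periods + sample.stalled_periods
    if active == 0:
        return False  # an idle device is not treated as stalled
    return sample.stalled_periods / active > threshold


def network_congested(samples: Iterable[DeviceSample],
                      stall_threshold: float = STALL_RATIO_THRESHOLD,
                      fraction_threshold: float = CONGESTED_FRACTION_THRESHOLD) -> bool:
    """Assumed network criterion: the share of stalled devices exceeds a threshold."""
    samples = list(samples)
    if not samples:
        return False
    stalled = sum(1 for s in samples if is_stalled(s, stall_threshold))
    return stalled / len(samples) > fraction_threshold


if __name__ == "__main__":
    period = [
        DeviceSample("tile-0", forwarded_periods=90, stalled_periods=10),
        DeviceSample("tile-1", forwarded_periods=20, stalled_periods=80),
        DeviceSample("tile-2", forwarded_periods=30, stalled_periods=70),
        DeviceSample("tile-3", forwarded_periods=0, stalled_periods=0),
    ]
    print("congested" if network_congested(period) else "not congested")
```

Keeping the per-device test separate from the network-level test mirrors the two determining components in the claim, so either criterion can be replaced without touching the other.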
Address Seattle WA US