Title of invention: Recovering from failures without impact on data traffic in a shared bus architecture
Abstract: Methods of detecting and recovering from communication failures within an operating network switching device that is switching packets in a communication network, and associated structures. The communication failures addressed involve communications between the packet processors and a host CPU over a shared communications bus, e.g., a PCI bus. The affected packet processor(s), which may be all or a subset of the packet processors of the network switch, may be recovered without affecting hardware packet forwarding through the affected packet processors. This maximizes the uptime of the network switching device. Other packet processor(s), if any, of the network switching device that are not affected by the communication failure may continue their normal packet forwarding, i.e., hardware forwarding that does not involve communications with the host CPU as well as forwarding or other operations that do involve communications with the host CPU.
Publication number: US9030943(B2); Publication date: 2015.05.12
Application number: US201213548116; Filing date: 2012.07.12
Applicant: Foundry Networks, LLC; Inventors: Suresh Ravindran; Balasubramanian Adoor V.
Classification: G06F11/00; H04L12/46; G06F11/07; Main classification: G06F11/00
Agency: Kilpatrick Townsend & Stockton LLP; Agent: Kilpatrick Townsend & Stockton LLP
Main claim: 1. A method in a network device, the method comprising: storing, in a memory associated with a host processor of the network device, a set of data structures used for transferring data, on a shared bus, between the host processor and a plurality of packet processors of the network device, each packet processor in the plurality of packet processors configured to forward, from the network device, one or more packets received by the network device; detecting, by the host processor, an error condition indicative of a communication error between the host processor and a first packet processor from the plurality of packet processors; in response to detection of the error condition: identifying, by the host processor, from the plurality of packet processors, the first packet processor affected by the error condition; and performing, by the host processor, a set of recovery actions for recovering from the error condition, the set of recovery actions including disabling communication between the host processor and the first packet processor; and while the set of recovery actions is being performed, communicating data on the shared bus, between the host processor and at least one packet processor from the plurality of packet processors other than the first packet processor and forwarding, by the first packet processor, at least one packet received by the network device using forwarding information programmed prior to the host processor detecting the error condition.
Address: San Jose, CA, US
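The flow recited in the main claim can be illustrated with a small software simulation. This is only a sketch under stated assumptions, not the patented implementation: the names `PacketProcessor`, `Host`, `bus_fault`, `begin_recovery`, `finish_recovery`, and `simulate` are hypothetical, the shared bus is modeled as plain Python lists, and clearing a descriptor ring during recovery is an illustrative choice that the claim itself does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class PacketProcessor:
    """Hypothetical stand-in for one packet processor on the shared bus."""
    pp_id: int
    # Forwarding entries programmed by the host before any failure occurs.
    forwarding_table: dict = field(default_factory=dict)
    bus_fault: bool = False  # simulated communication error toward the host

    def forward(self, dst):
        """Hardware forwarding path: consults only the local table."""
        return self.forwarding_table.get(dst)


class Host:
    """Hypothetical host CPU holding per-processor transfer structures."""

    def __init__(self, processors):
        self.processors = {pp.pp_id: pp for pp in processors}
        # One descriptor ring per packet processor, kept in host memory.
        self.rings = {pp.pp_id: [] for pp in processors}
        self.comm_enabled = {pp.pp_id: True for pp in processors}

    def send(self, pp_id, message):
        """Simulated host-to-processor transfer over the shared bus."""
        if not self.comm_enabled[pp_id]:
            return False
        self.rings[pp_id].append(message)
        return True

    def detect_affected(self):
        """Identify processors whose host communication shows an error."""
        return [i for i, pp in self.processors.items()
                if pp.bus_fault and self.comm_enabled[i]]

    def begin_recovery(self, pp_id):
        """Recovery action: disable host communication with this processor."""
        self.comm_enabled[pp_id] = False
        self.rings[pp_id].clear()  # assumption: drop suspect descriptors

    def finish_recovery(self, pp_id):
        """Re-enable communication once the simulated fault is cleared."""
        self.processors[pp_id].bus_fault = False
        self.comm_enabled[pp_id] = True


def simulate():
    """Run one failure-and-recovery cycle and report what kept working."""
    pp0 = PacketProcessor(0, {"10.0.0.1": "port1"})
    pp1 = PacketProcessor(1, {"10.0.0.2": "port2"})
    host = Host([pp0, pp1])

    pp0.bus_fault = True                 # inject a fault on processor 0
    affected = host.detect_affected()
    for pp_id in affected:
        host.begin_recovery(pp_id)

    result = {
        "affected": affected,
        # Unaffected processor still talks to the host during recovery.
        "host_to_pp1": host.send(1, "stats-request"),
        # Host communication with the affected processor is disabled.
        "host_to_pp0": host.send(0, "stats-request"),
        # Affected processor still forwards with pre-programmed entries.
        "pp0_forward": pp0.forward("10.0.0.1"),
    }

    for pp_id in affected:
        host.finish_recovery(pp_id)
    result["host_to_pp0_after"] = host.send(0, "stats-request")
    return result
```

Calling `simulate()` exercises the three properties the claim combines: only the affected processor loses host communication, the other processor keeps exchanging data with the host, and the affected processor keeps forwarding from the table programmed before the error was detected.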