Title: Heterogeneous recovery in a redundant memory system
Abstract: Providing heterogeneous recovery in a redundant memory system that includes a memory controller, a plurality of memory channels in communication with the memory controller, an error detection code mechanism configured for detecting a failing memory channel, and an error recovery mechanism. The error recovery mechanism is configured for receiving notification of the failing memory channel, for performing a recovery operation on the failing memory channel while other memory channels are performing normal system operations, for bringing the recovered channel back into operational mode with the other memory channels for store operations, for continuing to mark the recovered channel to guard against stale data, for removing any stale data after the recovery operation is complete, and for removing the mark on the recovered channel to allow the normal system operations with all of the memory channels, the removing based on the removing any stale data being complete.
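The marking scheme in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes simple mirroring across channels in place of the real error detection code and redundancy logic, it runs sequentially instead of concurrently with normal traffic, and every name in it (Channel, RedundantMemory, fail, recover, scrub) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    data: dict = field(default_factory=dict)  # address -> value
    online: bool = True    # channel accepts store operations
    marked: bool = False   # channel is excluded from reads (may hold stale data)


class RedundantMemory:
    """Toy model: every online channel holds a full copy (assumed mirroring)."""

    def __init__(self, n_channels=3):
        self.channels = [Channel() for _ in range(n_channels)]

    def store(self, addr, value):
        # Stores go to every online channel, including a recovered, still-marked one.
        for ch in self.channels:
            if ch.online:
                ch.data[addr] = value

    def load(self, addr):
        # Reads are served only by online, unmarked channels.
        for ch in self.channels:
            if ch.online and not ch.marked:
                return ch.data[addr]
        raise RuntimeError("no trusted channel available")

    def fail(self, idx):
        # Notification of a failing channel: take it offline and mark it.
        self.channels[idx].online = False
        self.channels[idx].marked = True

    def recover(self, idx):
        # Recovery finished: back online for stores, but the mark stays.
        self.channels[idx].online = True

    def scrub(self, idx):
        # Remove stale data by copying from a trusted channel, then remove the mark.
        target = self.channels[idx]
        trusted = next(c for c in self.channels if c.online and not c.marked)
        for addr, value in trusted.data.items():
            target.data[addr] = value
        target.marked = False
```

In this sketch, a value stored while a channel is offline leaves that channel stale. The mark keeps reads away from it until scrub has rewritten the stale addresses, and only then is the mark removed.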
Publication number: US8775858(B2) Publication date: 2014.07.08
Application number: US201313793363 Filing date: 2013.03.11
Applicant: International Business Machines Corporation Inventors: Gower Kevin C.; Lastras-Montano Luis A.; Meaney Patrick J.; Papazova Vesselina K.; Stephens Eldee
Classification: G06F11/00 Main classification: G06F11/00
Agency: Cantor Colburn LLP Agent: Cantor Colburn LLP; Campbell John
Principal claim: 1. A computer implemented method for detecting a failing memory channel and performing a recovery operation, the method comprising: receiving a notification that a memory channel has failed, the memory channel one of a plurality of memory channels in a memory system; performing the recovery operation on the failing memory channel while other memory channels are performing normal system operations, the recovery operation comprising performing clock calibration on the failing memory channel, and performing data calibration on the plurality of memory channels based on completion of the clock calibration on the failing memory channel; removing any stale data based on the recovery operation being completed, the removing while the other memory channels are performing normal system operations; and continuing normal system operations with all of the memory channels based on the removing any stale data being completed.
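The ordering of steps in claim 1 can be sketched as a gated sequence in which each step starts only after the previous one completes. This is a minimal illustration under stated assumptions: the step names, the run_recovery function, and the actions callback table are all hypothetical, and targeting stale-data removal at the recovered channel is an assumption (the claim does not name the target).

```python
def run_recovery(failed, channels, actions):
    """Run the claimed steps in order; stop at the first step that does not complete.

    failed   -- identifier of the failing channel
    channels -- identifiers of all channels in the memory system
    actions  -- hypothetical table: step name -> callable(targets) -> bool
    Returns (trace, resumed), where resumed is True only if every step completed.
    """
    if failed not in channels:
        raise ValueError("failed channel is not part of the memory system")

    plan = [
        # Clock calibration runs on the failing channel only.
        ("clock_calibration", [failed]),
        # Data calibration runs on the plurality of channels, after clock calibration.
        ("data_calibration", list(channels)),
        # Stale data removal follows the completed recovery (target is an assumption).
        ("remove_stale_data", [failed]),
    ]

    trace = []
    for name, targets in plan:
        completed = actions[name](targets)
        trace.append((name, targets, completed))
        if not completed:
            return trace, False  # do not resume with all channels
    return trace, True  # normal operation continues with all channels
```

A caller would supply one callback per step, for example functions that drive the hardware calibration and report success. The sketch does not model the other channels continuing normal operations in parallel.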
Address: Armonk NY US