Title SELF MONITORING AND SELF REPAIRING ECC
Abstract Exemplary embodiments of the present invention disclose a method and system for monitoring a first Error Correcting Code (ECC) device for failure and replacing the first ECC device with a second ECC device if the first ECC device begins to fail or fails. In a step, an exemplary embodiment detects that a specified number of correctable errors is exceeded. In another step, an exemplary embodiment detects the occurrence of an uncorrectable error. In another step, an exemplary embodiment performs a loopback test on an ECC device if a specified number of correctable errors is exceeded or if an uncorrectable error occurs. In another step, an exemplary embodiment replaces an ECC device that fails the loopback test with an ECC device that passes a loopback test.
Publication No. US2014250340(A1) Publication date 2014.09.04
Application No. US201313781807 Application date 2013.03.01
Applicant INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors Cordero Edgar R.; Dell Timothy J.; Henderson Joab D.; Sabrowski Jeffrey A.; Saetow Anuwat; Sethuraman Saravanan
Classification G06F11/28 Main classification G06F11/28
Agency Agent
Main claim 1. A system for monitoring a first Error Correcting Code (ECC) module for failure and replacing a first ECC module with a second ECC module if the first ECC module fails, the system comprising: an ECC system comprised of the first ECC module and the second ECC module that independently perform ECC; logic to count correctable errors detected by the first ECC module and by the second ECC module in data that is read from a memory; logic to detect an uncorrectable error by the first ECC module and by the second ECC module in data that is read from memory; logic to perform a first loopback test that replaces an input to the first ECC module with an output from the first ECC module; logic to perform a second loopback test that replaces an input to the second ECC module with an output from the second ECC module; logic to detect a failed first ECC module with the first loopback test; logic to detect a failed second ECC module with the second loopback test; logic to replace the first ECC module with the second ECC module in the ECC system; wherein the system is operable to: detect that a specified number of correctable errors is exceeded; detect an occurrence of an uncorrectable error; perform the first loopback test on the first ECC module if the specified number of correctable errors is exceeded or if an uncorrectable error is detected in the first ECC module; perform the second loopback test on the second ECC module if the specified number of correctable errors is exceeded or if an uncorrectable error is detected in the second ECC module; and replace the first ECC module that fails the first loopback test with the second ECC module that passes the second loopback test.
Address Armonk NY US
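
The control flow recited in the main claim (count correctable errors, detect uncorrectable errors, run loopback tests, swap modules) can be illustrated with a minimal software sketch. It is not the patented hardware logic. Everything named here is a hypothetical stand-in: `EccModule` uses a toy triple-copy code with word-level voting in place of a real ECC, `stuck_bits` models one possible module fault, `loopback_test` uses arbitrary test patterns, and only the active module decodes reads, whereas the claim has both modules performing ECC independently. Resetting the counter after each check is also an assumption.

```python
from enum import Enum


class Status(Enum):
    OK = 0
    CORRECTED = 1
    UNCORRECTABLE = 2


class EccModule:
    """Toy ECC module (assumption): triple-copy code with word-level voting."""

    def __init__(self, name, stuck_bits=0):
        self.name = name
        self.stuck_bits = stuck_bits  # hypothetical fault: output bits stuck at 1

    def encode(self, data):
        return (data, data, data)

    def decode(self, codeword):
        a, b, c = codeword
        if a == b or a == c:
            value = a
            status = Status.OK if a == b == c else Status.CORRECTED
        elif b == c:
            value, status = b, Status.CORRECTED
        else:
            # No two copies agree, so the word cannot be recovered.
            value, status = a, Status.UNCORRECTABLE
        return value | self.stuck_bits, status


def loopback_test(module, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Feed the module's own output back into its input and compare."""
    for pattern in patterns:
        value, status = module.decode(module.encode(pattern))
        if value != pattern or status is not Status.OK:
            return False
    return True


class EccSystem:
    def __init__(self, active, spare, ce_threshold):
        self.active = active
        self.spare = spare
        self.ce_threshold = ce_threshold  # the "specified number" of correctable errors
        self.ce_count = 0

    def read(self, codeword):
        value, status = self.active.decode(codeword)
        if status is Status.CORRECTED:
            self.ce_count += 1
        if status is Status.UNCORRECTABLE or self.ce_count > self.ce_threshold:
            self._check_and_repair()
        return value, status

    def _check_and_repair(self):
        # Swap only if the active module fails its loopback and the spare passes.
        if not loopback_test(self.active) and loopback_test(self.spare):
            self.active, self.spare = self.spare, self.active
        self.ce_count = 0  # assumption: restart counting after each check
```

With a threshold of 2 and an active module built with `stuck_bits=0x01`, the third correctable error triggers the loopback tests, the active module fails on the `0x00` pattern, and the spare takes over. When both modules are healthy, an uncorrectable error still triggers the tests but no swap occurs, which suggests the fault lies in memory rather than in the ECC module.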