摘要 |
A method of and an apparatus for distinguishing between transient and solid errors within a single-error-correcting semiconductor memory storage unit (MSU) comprised of a plurality of large scale integrated (LSI) bit planes and for notifying the associated data processing system of required maintenance action. The method utilizes an error logging store (ELS) that is comprised of a plurality of memory error registers one for each separately associated word group within the MSU. Each memory error register contains storage for: (1) the Error Correction Code (ECC) defined, failing bit position; (2) the single bit error counter; (3) the multiple single bit error tag; and (4) the multiple bit error tag. Upon detection of an error within a word group, the associated memory error register is accessed to determine the history of previously detected errors within that word group. The central processing unit (CPU) is notified by a priority interrupt of the error status of that word group if: (1) the number of consecutive errors within a word group at the same bit position reaches a set threshold indicating the high probability of a solid single bit error; or (2) the error detected is in a different bit position from that previously identified as a solid single bit error indicating the high probability of a future uncorrectable multiple-bit error. This method and apparatus notifies the CPU of the likelihood of imminent uncorrectable errors and maintains a history of the error indications that lead to that conclusion. |