Abstract |
A method, apparatus, and computer program product are provided for implementing uncorrectable error isolation in a computer system while the system continues to run. A memory controller fetches data from a system memory and captures error information; responsive to detecting an uncorrectable error, it generates a predefined attention to a service processor. The service processor, utilizing a processor runtime diagnostic (PRD) program, reads the captured error data and identifies the memory extent containing the uncorrectable error. The memory controller then performs accelerated scrubbing of the identified memory extent, capturing error information, and, responsive to a scrub correctable error threshold being exceeded, sends a predefined scrub-threshold-exceeded attention to the service processor. The service processor reads the captured error data and identifies a failed memory chip.
|
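The sequence described in the abstract can be illustrated with a small simulation. The following is a minimal sketch only, not the claimed implementation: the class names, the attention strings, the extent size, the threshold value, and the rule of picking the chip with the most captured correctable errors are all assumptions introduced for illustration and do not appear in the source.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical values chosen for illustration; not taken from the source.
EXTENT_SIZE = 1024        # addresses per memory extent
SCRUB_CE_THRESHOLD = 3    # correctable errors tolerated during one scrub


@dataclass
class MemoryController:
    # Simulated faults: address -> ("UE", None) or ("CE", chip_number).
    faults: dict
    captured: list = field(default_factory=list)    # captured error records
    attentions: list = field(default_factory=list)  # attentions raised

    def fetch(self, address):
        """Normal fetch: on an uncorrectable error, capture it and raise an attention."""
        fault = self.faults.get(address)
        if fault and fault[0] == "UE":
            self.captured.append({"kind": "UE", "address": address})
            self.attentions.append("UE_ATTENTION")

    def accelerated_scrub(self, extent):
        """Scrub one extent only, capturing each correctable error found."""
        start = extent * EXTENT_SIZE
        ce_count = 0
        for address in range(start, start + EXTENT_SIZE):
            fault = self.faults.get(address)
            if fault and fault[0] == "CE":
                self.captured.append(
                    {"kind": "CE", "address": address, "chip": fault[1]}
                )
                ce_count += 1
                if ce_count > SCRUB_CE_THRESHOLD:
                    self.attentions.append("SCRUB_THRESHOLD_ATTENTION")
                    return


class ServiceProcessor:
    """Stands in for the service processor running the PRD program."""

    def identify_extent(self, mc):
        # Read the captured UE record and map its address to an extent.
        ue = next(r for r in mc.captured if r["kind"] == "UE")
        return ue["address"] // EXTENT_SIZE

    def identify_failed_chip(self, mc):
        # Assumed heuristic: the chip with the most captured CEs is the failed one.
        chips = Counter(r["chip"] for r in mc.captured if r["kind"] == "CE")
        return chips.most_common(1)[0][0]


def isolate(mc, sp, address):
    """Run the fetch -> attention -> scrub -> attention -> identify sequence."""
    mc.fetch(address)
    if "UE_ATTENTION" not in mc.attentions:
        return None
    extent = sp.identify_extent(mc)
    mc.accelerated_scrub(extent)
    if "SCRUB_THRESHOLD_ATTENTION" not in mc.attentions:
        return None
    return sp.identify_failed_chip(mc)


# Example: a UE at address 2050 and several CEs on chip 5 in the same extent.
faults = {2050: ("UE", None), 2100: ("CE", 2)}
faults.update({2048 + i: ("CE", 5) for i in (3, 10, 20, 30, 40)})
failed_chip = isolate(MemoryController(faults), ServiceProcessor(), 2050)
print(failed_chip)  # 5
```

In this sketch the scrub stops as soon as the threshold is exceeded, so only the errors captured up to that point inform the chip identification. A real memory controller and PRD program would rely on hardware error registers rather than an in-memory fault table.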