Abstract |
The disclosed embodiments relate to a memory system that facilitates probabilistic error correction for a failed memory component with partial-component sparing. During operation, the memory system accesses blocks of data, wherein each block of data includes an array of bits logically organized into R rows and C columns. The C columns include (1) a row-checkbit column containing row-parity bits for each of the R rows, (2) an inner-checkbit column containing X=R−S inner checkbits and S spare bits, and (3) C−2 data-bit columns containing data bits. Moreover, each column is stored in a different memory component, and the checkbits are generated from the data bits to provide guaranteed detection and probabilistic correction for a failed memory component. When the memory system determines that a memory component has failed, the memory system examines the pattern of errors associated with the failed component to determine whether the failure affects a partial component associated with S or fewer bits of an associated failed column in each block of data. If so, the memory system corrects the data bits from the failed partial component and remaps them to the S spare bits in the inner-checkbit column. After the correcting and remapping operations are complete, the memory system resumes operation with guaranteed detection and probabilistic correction of a subsequent failed memory component. |
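
The bit accounting and the partial-component test described in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the function names `block_layout` and `plan_remap`, the per-row error map, the first-come assignment of spare bits, and the values R=32, C=18, S=8 are all hypothetical choices made for illustration. The sketch also does not model how the inner checkbits are computed, since the abstract does not specify that.

```python
from typing import Dict, List, Optional


def block_layout(R: int, C: int, S: int) -> Dict[str, int]:
    """Count the bits of each kind in one R x C block (hypothetical helper)."""
    if C < 3 or not 0 <= S <= R:
        raise ValueError("need C >= 3 and 0 <= S <= R")
    return {
        "data_bits": R * (C - 2),   # C-2 data-bit columns of R bits each
        "row_parity_bits": R,       # one row-parity bit per row
        "inner_checkbits": R - S,   # X = R - S
        "spare_bits": S,            # spares share the inner-checkbit column
    }


def plan_remap(error_rows: List[bool], S: int) -> Optional[Dict[int, int]]:
    """Decide whether a failed column can be handled by partial sparing.

    error_rows[r] is True if row position r of the failed column showed
    errors in the observed blocks (an assumed representation of the
    "pattern of errors"). Returns a map from failed row position to spare
    index when at most S positions are affected, otherwise None.
    """
    bad = [r for r, failed in enumerate(error_rows) if failed]
    if len(bad) > S:
        return None  # failure is too wide for the S spare bits
    # Assumed policy: hand out spare bits in ascending row order.
    return {row: spare for spare, row in enumerate(bad)}


if __name__ == "__main__":
    layout = block_layout(R=32, C=18, S=8)
    # The four kinds of bits must account for every bit in the block.
    assert sum(layout.values()) == 32 * 18
    print(layout)
    print(plan_remap([False, True, False, True], S=2))
    print(plan_remap([True, True, True, False], S=2))
```

With these assumed parameters, a failure confined to two row positions fits within the spare bits and yields a remapping, while a failure touching three positions with S=2 does not qualify as a partial-component failure in this sketch.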