Title of invention |
Fault-tolerant computer system |
Abstract |
A system and method for providing a fault-tolerant basis to execute instructions is disclosed. The system comprises an error detector, a rewriting module, a recovery engine, a fault locator and a fallback programming module. The error detector detects a first error in the execution of an instruction in a faulty stage unit of a first pipeline unit. The rewriting module rewrites the instruction to form a rewritten instruction responsive to detecting the first error. The recovery engine executes the rewritten instruction in the first pipeline unit. The error detector determines if a second error occurs in the execution of the rewritten instruction. Responsive to detecting the second error, the recovery engine selects a substitute stage unit for the faulty stage unit from a second pipeline unit. The fault locator locates a faulty component for the faulty stage unit. The fallback programming module establishes a fallback unit for the faulty component. |
Publication number |
US8898516(B2) |
Publication date |
2014.11.25 |
Application number |
US201113316314 |
Filing date |
2011.12.09 |
Applicant |
Toyota Jidosha Kabushiki Kaisha |
Inventors |
Honda Makoto; Hirano Kanji |
Classification number |
G06F11/00 |
Main classification number |
G06F11/00 |
Patent agency |
Patent Law Works LLP |
Patent agent |
Patent Law Works LLP |
Principal claim |
1. A method for providing a fault-tolerant basis to execute instructions, the method comprising:
detecting a first error in the execution of an instruction in a faulty stage unit of a first pipeline unit in a first core;
rewriting the instruction to form a rewritten instruction in response to the detection of the first error;
executing the rewritten instruction in the first pipeline unit;
determining if a second error occurs in the execution of the rewritten instruction;
responsive to determining the occurrence of the second error, selecting a first substitute stage unit for the faulty stage unit from a second pipeline unit in the first core, and re-executing the instruction in the first substitute stage unit of the second pipeline unit and non-faulty stage units of the first pipeline unit in the first core;
determining whether a third error occurs in the execution of the instruction in the first substitute stage unit of the second pipeline unit and the non-faulty stage units of the first pipeline unit;
responsive to detecting the third error, re-executing the instruction in a second substitute stage unit of a third pipeline unit in a second core;
locating a faulty component for the faulty stage unit; and
establishing a fallback unit for the faulty component. |
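The escalation order recited in the claim can be illustrated with a small software simulation. This is only a sketch under stated assumptions, not the patented implementation: `StageUnit`, `broken_forms`, `recover`, and the string log are hypothetical names introduced here, a hardware fault is modelled as a set of instruction forms that a unit mis-executes, and running fault location and fallback setup only after a second error is one possible reading of the claim.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class StageUnit:
    """Hypothetical model of one pipeline stage unit."""
    core: int
    pipeline: int
    # Assumption: a fault is modelled as instruction forms this unit mis-executes.
    broken_forms: frozenset = frozenset()

    def executes_ok(self, form: str) -> bool:
        return form not in self.broken_forms


def recover(instr: str,
            rewrite: Callable[[str], str],
            faulty: StageUnit,
            same_core_substitute: StageUnit,
            other_core_substitute: StageUnit) -> List[str]:
    """Walk the escalation steps after a first error in `faulty`; return a log."""
    log = ["first error: rewrite instruction and retry in first pipeline"]

    # Step 1: retry a rewritten form of the instruction on the same pipeline.
    if faulty.executes_ok(rewrite(instr)):
        log.append("rewritten instruction executed without error")
        return log

    # Step 2: second error, so borrow the matching stage from a second
    # pipeline in the same core and re-execute the original instruction.
    log.append(
        f"second error: substitute stage from pipeline "
        f"{same_core_substitute.pipeline} in core {same_core_substitute.core}"
    )

    # Step 3: third error, so escalate to a stage of a pipeline in another core.
    if not same_core_substitute.executes_ok(instr):
        log.append(
            f"third error: substitute stage from pipeline "
            f"{other_core_substitute.pipeline} in core {other_core_substitute.core}"
        )

    # Assumption: location and fallback run once the fault looks persistent.
    log.append("locate faulty component")
    log.append("establish fallback unit")
    return log
```

For example, with a hypothetical rewrite `lambda s: s + "_alt"`, a unit whose `broken_forms` contains only `"mul"` recovers at the rewrite step, while a unit that also fails on `"mul_alt"` escalates to the same-core substitute before fault location and fallback are logged.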
Address |
Toyota-shi, Aichi-ken JP |