Abstract |
PROBLEM TO BE SOLVED: To improve the reliability and availability of a multiprocessor system in which cells, each including memories and more than one processor, are mutually connected by an interconnecting network, by preventing a fault that occurs in one cell during system operation from propagating to the other cells. SOLUTION: A cell 400 that detects a fault when accessing a memory 300 sends a fault report to a service processor 600. The service processor 600 returns the received fault report by broadcasting it to all cells 400 as a command that temporarily stops cell operation in hardware, so that all the cells 400 stop operating immediately. The service processor 600 then gathers the information needed for fault analysis from the respective cells 400, analyzes the fault, logically disconnects the suspect cell from the system, reconfigures the system, and releases the temporary stop state of the respective cells so that system operation continues.
|
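The sequence described in the abstract (report, broadcast stop, gather, analyze, disconnect, resume) can be illustrated with a small software model. This is only a sketch under stated assumptions: the abstract describes a hardware mechanism, and the names `Cell`, `ServiceProcessor`, `detect_fault`, `handle_fault_report`, and `analyze` are hypothetical. The abstract does not say how the suspect cell is identified, so the analysis step here simply blames the reporting cell as a stand-in.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Hypothetical model of one cell (memory plus processors)."""
    cell_id: int
    stopped: bool = False    # temporary stop state
    attached: bool = True    # False once logically disconnected
    fault_log: list = field(default_factory=list)

    def detect_fault(self, detail: str) -> dict:
        # Record the fault locally and build the report for the service processor.
        self.fault_log.append(detail)
        return {"reporter": self.cell_id, "detail": detail}


class ServiceProcessor:
    """Hypothetical model of the service processor's fault handling."""

    def __init__(self, cells):
        self.cells = {c.cell_id: c for c in cells}

    def analyze(self, report: dict, logs: dict) -> int:
        # Assumption: the real analysis is not described in the abstract.
        # As a placeholder, treat the reporting cell as the suspect.
        return report["reporter"]

    def handle_fault_report(self, report: dict) -> int:
        # 1. Return the report to all cells as a temporary-stop command.
        for cell in self.cells.values():
            cell.stopped = True
        # 2. Gather the information needed for fault analysis from each cell.
        logs = {cid: list(c.fault_log) for cid, c in self.cells.items()}
        # 3. Analyze the fault and pick a suspect cell.
        suspect = self.analyze(report, logs)
        # 4. Logically disconnect the suspect cell (system reconfiguration).
        self.cells[suspect].attached = False
        # 5. Release the temporary stop state of the cells that remain attached.
        for cell in self.cells.values():
            if cell.attached:
                cell.stopped = False
        return suspect


cells = [Cell(0), Cell(1), Cell(2)]
sp = ServiceProcessor(cells)
report = cells[1].detect_fault("memory access error")
suspect = sp.handle_fault_report(report)
```

In this model the suspect cell stays stopped and detached, while the other cells resume. Stopping every cell before any analysis begins is the step that keeps the fault from spreading while the service processor works.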