Title of invention Parallel computer, and job information acquisition method for parallel computer
Abstract A parallel computer includes a plurality of calculation nodes and a management node. A calculation node includes a retention control unit that retains job information in a retention unit in association with an identification number. The management node includes a retention control unit that retains the job information in a retention unit and that retains, as a snapshot, job information of the same identification number in a case where the job information of the same identification number about a calculation node is detected in the retention unit. The retention unit of the calculation node includes a retention region enabling retention of job information corresponding to a plurality of periods, and the retention unit of the management node includes a retention region enabling retention of the job information corresponding to the plurality of periods with respect to each of the calculation nodes.
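The management-node behaviour summarized in the abstract can be illustrated with a minimal Python sketch. It rests on one reading of the text: a snapshot is taken once job information with the same identification number has been received from every calculation node. The class name `ManagementRetention`, the method `receive`, and the dictionary layout are hypothetical and do not appear in the patent.

```python
from typing import Dict, List, Optional


class ManagementRetention:
    """Hypothetical model of the management node's retention unit."""

    def __init__(self, node_names: List[str]) -> None:
        # One retention region per calculation node, keyed by identification number,
        # so several periods can be held for each node at once.
        self.received: Dict[str, Dict[int, dict]] = {n: {} for n in node_names}
        self.snapshot: Optional[Dict[str, dict]] = None
        self.snapshot_id: Optional[int] = None

    def receive(self, node: str, period_id: int, job_info: dict) -> bool:
        """Retain job information; return True when a snapshot was taken.

        A True result means the caller should send a clear request to
        every calculation node (assumed interpretation of the claim).
        """
        self.received[node][period_id] = job_info
        # Assumption: "same identification number detected" means all nodes
        # now hold an entry for this identification number.
        if all(period_id in per_node for per_node in self.received.values()):
            self.snapshot = {
                n: per_node[period_id] for n, per_node in self.received.items()
            }
            self.snapshot_id = period_id
            # Clear everything that is not part of the snapshot.
            for per_node in self.received.values():
                per_node.clear()
            return True
        return False
```

In this sketch a node that answers with an older identification number simply leaves the snapshot pending until a matching number arrives from it; the multi-period region per node is what makes that tolerance possible.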
Publication number US9336044(B2) Publication date 2016.05.10
Application number US201313778494 Filing date 2013.02.27
Applicant FUJITSU LIMITED Inventor Takeshita Hiroto
Classification G06F9/46;G06F11/34 Main classification G06F9/46
Agency Staas & Halsey LLP Agent Staas & Halsey LLP
Principal claim 1. A parallel computer comprising: a plurality of calculation nodes that execute a calculation job distributively in parallel; and a management node that manages the plurality of calculation nodes, wherein one of the calculation nodes comprises: a processor configured to execute a first process including: acquiring job information about a calculation job handled by the one of the calculation nodes according to a period timing common to the calculation nodes; retaining the job information in a retention unit of the one of the calculation nodes in association with an identification number identifying the period timing at which the job information is acquired at the acquiring, and clearing all the job information retained in the retention unit when a clear request is received from the management node; and transmitting, when a transmission request for the job information about a designated identification number is received from the management node, the job information to the management node in a case where the job information exists in the retention unit, and transmitting other job information about an identification number just before the designated identification number to the management node in a case where the job information does not exist in the retention unit and the other job information exists in the retention unit, the management node comprises: a processor configured to execute a second process including: retaining the job information in a retention unit of the management node when the job information is received from each of the calculation nodes according to the transmission request, retains, as a snapshot, job information including the same identification number as the identification number of other job information in a case where the job information is detected in the retention unit, and clearing job information other than the job information retained in the retention unit of the management node; and transmitting the clear request to each of the calculation nodes when the job information of the same identification number is retained as a snapshot, the retention unit of the calculation node comprises a retention region enabling retention of job information corresponding to a plurality of periods, and the retention unit of the management node comprises a retention region enabling retention of the job information corresponding to the plurality of periods with respect to each of the calculation nodes, wherein the retention unit of each of the calculation nodes is provided with a first retention region and a second retention region as retention regions for retaining the job information corresponding to two time belts in order to absorb a gap corresponding to one time belt.
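The calculation-node side of the claim (retain per period, clear on request, and fall back to the preceding identification number on transmission) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: identification numbers are assumed to be consecutive integers, so "just before" is modelled as `designated_id - 1`, and the names `CalcNodeRetention`, `retain`, `clear`, and `transmit` are hypothetical.

```python
from typing import Dict, Optional, Tuple


class CalcNodeRetention:
    """Hypothetical model of a calculation node's retention unit."""

    def __init__(self, regions: int = 2) -> None:
        # Two regions correspond to the first and second retention regions
        # (two time belts) named in the claim; must be at least 1.
        self.regions = regions
        self.store: Dict[int, dict] = {}

    def retain(self, period_id: int, job_info: dict) -> None:
        """Retain job information under the identification number of its period."""
        self.store[period_id] = job_info
        # Keep only the most recent `regions` periods.
        for old_id in sorted(self.store)[:-self.regions]:
            del self.store[old_id]

    def clear(self) -> None:
        """Handle a clear request from the management node."""
        self.store.clear()

    def transmit(self, designated_id: int) -> Optional[Tuple[int, dict]]:
        """Answer a transmission request for a designated identification number."""
        if designated_id in self.store:
            return designated_id, self.store[designated_id]
        # Fallback: the entry just before the designated number, if retained.
        previous_id = designated_id - 1
        if previous_id in self.store:
            return previous_id, self.store[previous_id]
        return None
```

For example, a node that has retained periods 1 and 2 answers a request for period 3 with the period 2 entry, which is how the second region absorbs a gap of one time belt between nodes.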
Address Kawasaki JP