Title of invention: MINIMIZING MICRO-INTERRUPTIONS IN HIGH-PERFORMANCE COMPUTING
Abstract: Data storage systems and methods for storing data in computing nodes of a super computer or compute cluster are described herein. The super computer storage may be coupled with a primary storage system. In addition to a CPU and memory, non-volatile memory is included with the computing nodes as local storage. The super computer includes a plurality of computing groups, each including a plurality of computing nodes. There is one burst buffer fabric per group and one input/output node per group. When data bursts occur, data may be stored by a first computing node on the local storage of a second computing node in the computing group through the burst buffer fabric without interrupting the CPU in the second computing node. Further, the local storage of other computing nodes may be used to store redundant copies of data from a first computing node to make the super computer data resilient.
Publication number: US2014337557(A1) Publication date: 2014.11.13
Application number: US201414274391 Filing date: 2014.05.09
Applicant: DataDirect Networks, Inc. Inventors: Nowoczynski Paul; Vildibill Michael; Cope Jason; Uppu Pavan
Classification: G06F13/28 Main classification: G06F13/28
Agency: Agent:
Main claim: 1. A data storage method comprising: a CPU of a first computing node of a super computer issuing a data write request; evaluating the availability of local storage in the first computing node; evaluating the availability of local storage in at least one other computing node of a plurality of computing nodes configured as a computing group in which the first computing node is a member, including querying an input/output node for the computing group to obtain an identifier of the at least one other computing node in the same computing group with available local storage; evaluating storage policies in view of the evaluating the availability of local storage in the first computing node and the evaluating the availability of local storage in the at least one other computing node in the computing group; writing data to local storage of at least one of the other computing nodes in the computing group according to the storage policies and the availability of local storage both in the first computing node and in other computing nodes, including writing data from the first computing node to the local storage of the at least one other computing node through a burst buffer fabric, wherein the burst buffer fabric conforms to a storage device access standard.
Address: Chatsworth CA US
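
Illustrative sketch (not part of the patent record): the steps recited in the main claim can be modeled in a few lines of Python. This is a toy in-memory simulation under stated assumptions, not the applicant's implementation. All names (`ComputeNode`, `IONode`, `BurstBufferFabric`, `handle_write`) and the `prefer_local` policy flag are hypothetical; the record does not specify what the storage policies contain or which storage device access standard the fabric uses.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """A computing node with local non-volatile storage (modeled as a dict)."""
    node_id: str
    capacity: int  # bytes of local storage (assumed unit)
    stored: dict = field(default_factory=dict)

    def free(self) -> int:
        # Remaining local storage in bytes.
        return self.capacity - sum(len(v) for v in self.stored.values())


@dataclass
class IONode:
    """Hypothetical per-group I/O node that knows the group's members."""
    members: list

    def nodes_with_space(self, size: int, exclude: str) -> list:
        # Return identifiers of other group members with enough free storage.
        return [n.node_id for n in self.members
                if n.node_id != exclude and n.free() >= size]


class BurstBufferFabric:
    """Stand-in for the per-group fabric: places data directly into the
    target node's local storage, without calling any code on that node."""

    def __init__(self, nodes):
        self._nodes = {n.node_id: n for n in nodes}

    def write(self, target_id: str, key: str, data: bytes) -> None:
        self._nodes[target_id].stored[key] = data


def handle_write(source, io_node, fabric, key, data, prefer_local=True):
    """Serve one write request; return the id of the node that holds the data."""
    size = len(data)
    # Evaluate availability of local storage in the first computing node.
    local_ok = source.free() >= size
    # Query the I/O node for other group members with available storage.
    peers = io_node.nodes_with_space(size, exclude=source.node_id)
    # Evaluate the (assumed, toy) storage policy against both results.
    if prefer_local and local_ok:
        source.stored[key] = data
        return source.node_id
    if peers:
        # Write to another node's local storage through the fabric.
        fabric.write(peers[0], key, data)
        return peers[0]
    raise RuntimeError("no local storage available in the computing group")
```

In this sketch, a node whose local storage is too small for a burst hands the data to the first peer reported by the I/O node. A real policy could also rank peers or place redundant copies on several nodes, as the abstract suggests.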