Title of Invention: Optimized peer-to-peer file transfers on a multi-node computer system
Abstract: A method and apparatus perform peer-to-peer file transfers on a High Performance Computing (HPC) cluster such as a Beowulf cluster. A peer-to-peer file tracker (PPFT) allows operating system, application and data files to be moved from a pre-loaded node to another node of the HPC cluster. A peer-to-peer (PTP) client is loaded into the nodes to facilitate PTP file transfers. These transfers reduce loading on networks, network switches and file servers, which shortens the time needed to load the nodes with these files and increases the overall efficiency of the multi-node computing system. The selection of the nodes participating in file transfers can be based on network topology, network utilization, job status and predicted network/computer utilization. This selection can be dynamic, changing during the file transfers as resource conditions change. The policies used to choose resources can be configured by an administrator.
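The node selection described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `NodeStatus`, `Policy`, `cost` and `choose_source`, the weighted-sum scoring, and the default weights are all assumptions made for illustration. The sketch only shows how administrator-configured policies might rank candidate source nodes by topology, utilization and job status.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeStatus:
    """Hypothetical snapshot of one node's resource attributes."""
    name: str
    has_file: bool           # node is already pre-loaded with the file
    hops_to_dest: int        # network topology: switch hops to the destination
    link_utilization: float  # network utilization, 0.0 (idle) to 1.0 (saturated)
    job_running: bool        # job status


@dataclass
class Policy:
    """Hypothetical administrator-set weights (values are illustrative)."""
    hop_weight: float = 1.0
    utilization_weight: float = 10.0
    busy_penalty: float = 5.0


def cost(node: NodeStatus, policy: Policy) -> float:
    """Lower cost means a more attractive source node under this policy."""
    return (policy.hop_weight * node.hops_to_dest
            + policy.utilization_weight * node.link_utilization
            + (policy.busy_penalty if node.job_running else 0.0))


def choose_source(nodes: List[NodeStatus], policy: Policy) -> Optional[NodeStatus]:
    """Pick the cheapest node that already holds the file, or None."""
    candidates = [n for n in nodes if n.has_file]
    if not candidates:
        return None
    return min(candidates, key=lambda n: cost(n, policy))


nodes = [
    NodeStatus("node01", True, 1, 0.9, False),   # close, but its link is busy
    NodeStatus("node02", True, 3, 0.1, False),   # farther away, link is idle
    NodeStatus("node03", False, 1, 0.0, False),  # does not hold the file
    NodeStatus("node04", True, 1, 0.2, True),    # close, but running a job
]
best = choose_source(nodes, Policy())
print(best.name if best else "no source available")
```

Changing the weights in `Policy` changes which node is chosen, which is the sense in which a policy steers the selection; calling `choose_source` again with fresh `NodeStatus` values would model the dynamic re-selection the abstract mentions.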
Publication No.: US8856275(B2) Publication Date: 2014.10.07
Application No.: US201313787728 Filing Date: 2013.03.06
Applicant: International Business Machines Corporation Inventors: Barsness Eric L.; Darrington David L.; Randles Amanda; Santosuosso John M.
Classification: G06F15/16; H04L29/08; G06F15/173 Main Classification: G06F15/16
Agency: Martin & Associates, LLC Agent: Martin & Associates, LLC
Main Claim: 1. A multi-node computer cluster comprising: a plurality of nodes that each comprise a processor and memory; a network with network switches connecting the plurality of nodes; a service node connected to the plurality of nodes, wherein the service node includes a resource manager that manages and monitors storage and network resources used by the system, and a scheduler that handles allocating and scheduling work and data placement on the compute nodes, wherein the resource manager and the resource scheduler provide resource attributes to dynamically adjust which nodes of the plurality of nodes participate in a peer-to-peer file transfer; a peer-to-peer file tracker in the service node that manages the plurality of nodes to accomplish the peer-to-peer file transfer between at least two of the plurality of nodes over the network, wherein the peer-to-peer transfer copies an operating system kernel from a source node to a destination node on the cluster in a process of booting the destination node; a peer-to-peer client residing in the at least two of the plurality of nodes, where the peer-to-peer client manages data flow in the peer-to-peer file transfer, wherein the peer-to-peer client uses the resource attributes from the resource manager and from the scheduler received in a file tracker update sent from the peer-to-peer file tracker to dynamically adjust which nodes of the plurality of nodes participate in the peer-to-peer file transfer while the file transfer is in progress by restarting segments of the transfer from a new source node; policies for the resource attributes that are set by a system administrator to indicate how to manage the peer-to-peer file transfer; wherein the resource attributes include the following: network topology, network utilization, network switch loading, file server loading, job status and historical information related to the resource attributes; and wherein the peer-to-peer client dynamically adjusts the nodes participating in the peer-to-peer file transfer while the file transfer is in progress.
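The claim's step of restarting segments of an in-progress transfer from a new source node can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the claimed implementation: `Segment`, `apply_tracker_update`, the shape of the tracker update (an ordered list of preferred source names) and the round-robin reassignment are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical record for one segment of a file being transferred."""
    index: int
    source: str         # node currently serving this segment
    done: bool = False  # True once the segment has been fully received


def apply_tracker_update(segments: List[Segment],
                         preferred_sources: List[str]) -> List[int]:
    """Reassign unfinished segments whose source is no longer preferred.

    Assumes a tracker update carries an ordered list of preferred source
    nodes. Completed segments are left alone; affected segments are spread
    round-robin over the preferred sources. Returns the restarted indices.
    """
    if not preferred_sources:
        return []  # nothing better to switch to; keep the current sources
    restarted: List[int] = []
    for seg in segments:
        if seg.done or seg.source in preferred_sources:
            continue
        seg.source = preferred_sources[len(restarted) % len(preferred_sources)]
        restarted.append(seg.index)
    return restarted


transfer = [
    Segment(0, "node01", done=True),  # already received, never restarted
    Segment(1, "node01"),
    Segment(2, "node02"),             # source is still preferred
    Segment(3, "node01"),
]
# Hypothetical update: node01 has become overloaded, so the tracker
# now prefers node02 and node05 as sources.
restarted = apply_tracker_update(transfer, ["node02", "node05"])
print(restarted, [s.source for s in transfer])
```

Only the unfinished segments served by the deprecated node are restarted, so data already received is not transferred again.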
Address: Armonk NY US