Title: System and method for performing message driven prefetching at the network interface
Abstract: Each computing node of a distributed computing system may implement a hardware mechanism at the network interface for message driven prefetching of application data. For example, a parallel data-intensive application that employs function shipping may distribute respective portions of a large data set to main memory on multiple computing nodes. The application may send messages to one of the computing nodes referencing data that is stored locally on the node. For each received message, the network interface on the recipient node may extract the reference, initiate the prefetching of referenced data into a local cache (e.g., an LLC), and then store the message for subsequent interpretation and processing by a local processor core. When the processor core retrieves a stored message for processing, the referenced data may already be in the LLC, avoiding a CPU stall while retrieving it from memory. The hardware mechanism may be configured via software.
Publication number: US9535842(B2)  Publication date: 2017.01.03
Application number: US201414472105  Filing date: 2014.08.28
Applicant: Oracle International Corporation  Inventors: Schwetman, Jr. Herbert D.; Zulfiqar Mohammad Arslan; Koka Pranay
Classification: G06F12/08; G06F13/16; H04L29/06; G06F12/10  Main classification: G06F12/08
Agency: Meyertons, Hood, Kivlin, Kowert & Goetzel, P.C.  Agent: Kowert Robert C.; Meyertons, Hood, Kivlin, Kowert & Goetzel, P.C.
Principal claim: 1. A system, comprising:
a plurality of computing nodes, each of which comprises:
  one or more processor cores;
  a cache memory shared by the one or more processor cores;
  a cache prefetcher;
  a local system memory, wherein the local system memory comprises one or more data structures configured to store messages, and wherein the local system memory stores a portion of a large data set; and
  a network interface component comprising a plurality of registers;
wherein the local system memories of the plurality of computing nodes collectively store program instructions that when executed on the one or more processor cores of the plurality of computing nodes implement an application that operates on the large data set;
wherein the network interface component of a given one of the plurality of computing nodes is configured to:
  receive a message from another one of the plurality of computing nodes, wherein the message comprises a reference to data in the large data set that is stored in the local system memory of the given computing node;
  determine the physical location of the referenced data on the given computing node;
  communicate information indicating the physical location of the referenced data to the cache prefetcher on the given computing node; and
  store the message in a data structure in the local system memory of the given computing node for subsequent processing by one of the one or more processor cores of the given computing node; and
wherein the cache prefetcher of the given computing node is configured to:
  receive the information indicating the physical location of the referenced data from the network interface component; and
  prefetch the referenced data from the local system memory of the given computing node into the cache memory of the given computing node; and
wherein the one of the one or more processor cores is configured to:
  retrieve the message from the data structure in the local system memory of the given computing node; and
  process the message, wherein to process the message, the one of the plurality of processor cores is configured to operate on the prefetched referenced data in the cache memory of the given computing node.
Address: Redwood City CA US
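
Illustrative sketch (not part of the patent record): the message flow recited in the abstract and principal claim can be modeled in software roughly as follows. This is a toy simulation under stated assumptions, not the claimed hardware. The names `Message`, `Node`, `nic_receive`, `prefetch`, `load`, and `core_process_next` are hypothetical, a plain dictionary stands in for whatever address translation the network interface performs, and the single `"double"` operation is invented purely to give the core something to compute.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    op: str   # hypothetical operation name shipped to the node
    ref: int  # reference (modeled here as a virtual address) to local data


class Node:
    """Toy model of one computing node: memory, cache, NIC, and one core."""

    def __init__(self, memory, page_table):
        self.memory = memory          # physical address -> value (local system memory)
        self.page_table = page_table  # assumed stand-in for address translation
        self.cache = {}               # physical address -> value (shared cache)
        self.queue = deque()          # message data structure in local memory
        self.hits = 0
        self.misses = 0

    def nic_receive(self, msg):
        # Network interface: extract the reference, determine its physical
        # location, hand that location to the prefetcher, then store the message.
        phys = self.page_table[msg.ref]
        self.prefetch(phys)
        self.queue.append(msg)

    def prefetch(self, phys):
        # Cache prefetcher: copy the referenced data from memory into the cache.
        self.cache[phys] = self.memory[phys]

    def load(self, phys):
        # Core-side load that records whether the data was already cached.
        if phys in self.cache:
            self.hits += 1
            return self.cache[phys]
        self.misses += 1
        value = self.memory[phys]
        self.cache[phys] = value
        return value

    def core_process_next(self):
        # Processor core: retrieve a stored message and operate on its data.
        msg = self.queue.popleft()
        value = self.load(self.page_table[msg.ref])
        if msg.op == "double":
            return value * 2
        raise ValueError(f"unknown operation: {msg.op}")


if __name__ == "__main__":
    node = Node(memory={0x1000: 21}, page_table={0x10: 0x1000})
    node.nic_receive(Message(op="double", ref=0x10))
    result = node.core_process_next()
    print(result, node.hits, node.misses)
```

In this model the core's load counts as a cache hit because the prefetch was issued when the message arrived, which mirrors the stall avoidance the abstract describes. Real hardware would also have to deal with cache capacity, eviction before the message is processed, and translation faults, none of which are modeled here.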