Title: Transparently enforcing policies in Hadoop-style processing infrastructures
Abstract: Method, system, and computer program product to facilitate selection of data nodes configured to satisfy a set of requirements for processing client data in a distributed computing environment by providing, for each data node of a plurality of data nodes in the distributed computing environment, nodal data describing the respective data node of the plurality of data nodes, receiving a request to process the client data, the client data being identified in the request, retrieving the set of requirements for processing the client data, and analyzing the retrieved data policy and the nodal data describing at least one of the data nodes, to select a first data node of the plurality of data nodes as a delegation target, the first data node selected based on having a higher suitability level for satisfying the set of requirements than a second data node of the plurality of data nodes.
Publication number: US9253055(B2)  Publication date: 2016.02.02
Application number: US201313886087  Filing date: 2013.05.02
Applicant: International Business Machines Corporation  Inventors: Nelke Sebastian; Oberhofer Martin A.; Saillet Yannick; Seifert Jens
Classification: G06F15/173; H04L12/26; H04L12/24; G06F9/50  Main classification: G06F15/173
Agency: Patterson & Sheridan, LLP  Agent: Patterson & Sheridan, LLP
Main claim: 1. A computer-implemented method, comprising: receiving, by a name node, a request to process a client workload on a subset of a plurality of data nodes in a distributed computing environment, wherein the name node stores a file system index reflecting files stored on the plurality of data nodes as part of a distributed file system of the distributed computing environment; retrieving, by the name node, a set of requirements for processing the client workload; analyzing, by the name node, the retrieved set of requirements and nodal data describing each of the data nodes, to select, a first data node of the plurality of data nodes as a delegation target to process at least a portion of the client workload, the first data node being selected upon determining: (i) the first data node has a level of resource utilization not exceeding a maximum delegation threshold, and (ii) the nodal data of the first data node satisfies a greater count of the set of requirements than the nodal data of a second data node of the plurality of data nodes, wherein the set of requirements and the nodal data are encrypted and stored on the name node; delegating the requested processing of the client workload to the delegation target; and updating the nodal data to reflect the delegation of the client workload to the delegation target, wherein each of the data nodes does not include nodal data identifying other data nodes.
Address: Armonk, NY, US
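
The selection step recited in the main claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `DataNode`, `count_satisfied`, `select_delegation_target`, and `delegate` are hypothetical, as are the attribute keys and the 0.8 threshold. Requirements are assumed to be simple key/value pairs matched by equality, and the encryption of requirements and nodal data on the name node is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Nodal data as the name node might hold it (hypothetical layout)."""
    name: str
    utilization: float                      # fraction of capacity in use, 0.0 to 1.0
    attributes: dict = field(default_factory=dict)
    workloads: list = field(default_factory=list)


def count_satisfied(node, requirements):
    """Count the requirements whose value matches the node's attribute exactly."""
    return sum(
        1 for key, wanted in requirements.items()
        if node.attributes.get(key) == wanted
    )


def select_delegation_target(nodes, requirements, max_utilization):
    """Pick the node that satisfies the most requirements among those
    whose utilization does not exceed the maximum delegation threshold."""
    eligible = [n for n in nodes if n.utilization <= max_utilization]
    if not eligible:
        return None
    return max(eligible, key=lambda n: count_satisfied(n, requirements))


def delegate(nodes, requirements, workload_id, max_utilization=0.8):
    """Select a target and update its nodal data to record the delegation."""
    target = select_delegation_target(nodes, requirements, max_utilization)
    if target is not None:
        target.workloads.append(workload_id)
    return target


nodes = [
    DataNode("a", 0.9, {"region": "EU", "encrypted_disk": True}),
    DataNode("b", 0.5, {"region": "EU", "encrypted_disk": False}),
    DataNode("c", 0.4, {"region": "US", "encrypted_disk": False}),
]
requirements = {"region": "EU", "encrypted_disk": True}
target = delegate(nodes, requirements, "job-1")
```

In this example node `a` matches both requirements but is excluded because its utilization exceeds the threshold, so node `b`, which matches one requirement against zero for node `c`, becomes the delegation target. All nodal data stays in the list held by the caller, mirroring the claim's condition that data nodes do not hold nodal data identifying other data nodes.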