Title of invention: Identifying nodes already storing indicated input data to perform distributed execution of an indicated program in a node cluster
Abstract: Techniques are described for managing execution of programs, such as for distributed execution of a program on multiple computing nodes. In some situations, the techniques include selecting a cluster of computing nodes to use for executing a program based at least in part on data to be used during the program execution. For example, the computing node selection for a particular program may be performed so as to attempt to identify and use computing nodes that already locally store some or all of the input data that those computing nodes will use when executing that program. Such techniques may provide benefits in a variety of situations, including when the input datasets to be used by a program are large and transferring data to and/or from computing nodes may impose large delays and/or monetary costs.
Publication number: US9276987(B1) Publication date: 2016.03.01
Application number: US201313794220 Filing date: 2013.03.11
Applicant: Amazon Technologies, Inc. Inventors: Sirota Peter;Khanna Richendra
Classification: G06F9/46;G06F15/16;G06F15/173;H04L29/08;G06F9/50 Main classification: G06F9/46
Agency: Seed IP Law Group PLLC Agent: Seed IP Law Group PLLC
Principal claim: 1. A computer-implemented method comprising: receiving, by one or more computing systems configured to provide a program execution service that executes programs for multiple users by using a plurality of computing nodes provided by the program execution service, configuration information regarding using indicated input data as part of executing an indicated program for a first user of the multiple users in a distributed manner on a computing node cluster; selecting, by the one or more configured computing systems, multiple computing nodes from the plurality for the computing node cluster based at least in part on the selected multiple computing nodes each being identified as locally storing at least some of the indicated input data prior to the receiving of the configuration information; determining, by the one or more configured computing systems and based on at least one of the selected multiple computing nodes being currently unavailable to perform the executing of the indicated program due to executing one or more other programs for one or more other users and based on the executing of the indicated program being determined to have a higher priority than the executing of the one or more other programs, to terminate the executing of the one or more other programs on the at least one selected computing node to enable use of the at least one selected computing node in the executing of the indicated program; and initiating, by the one or more configured computing systems and based at least in part on the terminating of the executing of the one or more other programs on the at least one selected computing node, the executing of the indicated program on the selected multiple computing nodes using the locally stored at least some indicated input data.
Address: Reno NV US
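
Illustrative sketch (not part of the patent record): the principal claim describes selecting nodes that already store some of the input data locally, terminating lower-priority programs on selected nodes that are busy, and then starting the indicated program. The Python below is one possible reading of those steps under stated assumptions. The names `Node`, `select_cluster`, and `start_program`, the integer priority scale, the ranking by amount of overlapping data, and the choice to skip nodes busy with equal or higher priority work are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Node:
    """Hypothetical model of a computing node in the service's pool."""
    name: str
    stored_blocks: Set[str] = field(default_factory=set)  # data held locally
    running_job: Optional[str] = None                      # None means idle
    running_priority: int = 0


def select_cluster(nodes: List[Node], input_blocks: Set[str],
                   priority: int, size: int) -> List[Node]:
    """Pick up to `size` nodes that already store some of the input data.

    Assumption: a busy node is usable only if the new program's priority is
    strictly higher than that of the program it is currently running.
    """
    usable = [
        n for n in nodes
        if n.stored_blocks & input_blocks
        and (n.running_job is None or priority > n.running_priority)
    ]
    # Assumption: prefer nodes holding more of the input data (stable sort).
    usable.sort(key=lambda n: len(n.stored_blocks & input_blocks), reverse=True)
    return usable[:size]


def start_program(nodes: List[Node], job: str, priority: int,
                  input_blocks: Set[str], size: int
                  ) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Select a cluster, terminate lower-priority work, start `job`.

    Returns the names of the selected nodes and a list of
    (node name, terminated job) pairs.
    """
    cluster = select_cluster(nodes, input_blocks, priority, size)
    terminated = []
    for node in cluster:
        if node.running_job is not None:
            # Terminate the lower-priority program to free the node.
            terminated.append((node.name, node.running_job))
        node.running_job = job
        node.running_priority = priority
    return [n.name for n in cluster], terminated


pool = [
    Node("a", {"b1", "b2"}),
    Node("b", {"b2", "b3"}, running_job="other", running_priority=1),
    Node("c", set()),
    Node("d", {"b1"}, running_job="big", running_priority=9),
]
selected, terminated = start_program(pool, "wordcount", 5, {"b1", "b2", "b3"}, 2)
print(selected, terminated)
```

In this sketch node `c` is never considered because it stores none of the input data, and node `d` is skipped because its current program has a higher priority than the new one. A real service would also need to handle transferring any input data that no selected node holds, which the sketch omits.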