Title: METHOD AND APPARATUS FOR ESTIMATING A COMPLETION TIME FOR MAPREDUCE JOBS
Abstract: A method, non-transitory computer readable medium, and apparatus for estimating a completion time for a MapReduce job are disclosed. For example, the method builds a general MapReduce performance model, computes one or more performance characteristics of each one of one or more benchmark workloads, computes one or more performance characteristics of the MapReduce job in the known processing system, selects a subset of the one or more benchmark workloads that have similar performance characteristics as the one or more performance characteristics of the MapReduce job, targets a cluster of processing nodes in a distributed processing system, computes one or more performance characteristics of the subset of the one or more benchmark workloads in the cluster of processing nodes and estimates the completion time for the MapReduce job.
Publication No.: US2015178419(A1) Publication date: 2015.06.25
Application No.: US201314135114 Filing date: 2013.12.19
Applicant: Xerox Corporation Inventors: LI JACK YANXIN; Jung Gueyoung
Classification: G06F17/50 Main classification: G06F17/50
Agency: Agent:
Principal claim: 1. A method for estimating a completion time for a MapReduce job, comprising: building, by a processor, a general MapReduce performance model; computing, by the processor, one or more performance characteristics of each one of one or more benchmark workloads in accordance with the general MapReduce performance model in a known processing system; computing, by the processor, one or more performance characteristics of the MapReduce job in accordance with the general MapReduce performance model in the known processing system; selecting, by the processor, a subset of the one or more benchmark workloads that have similar performance characteristics as the one or more performance characteristics of the MapReduce job; targeting, by the processor, a cluster of processing nodes in a distributed processing system having one or more unknown hardware configurations; computing, by the processor, one or more performance characteristics of the subset of the one or more benchmark workloads in the cluster of processing nodes; and estimating, by the processor, the completion time for the MapReduce job based upon a comparative analysis of the one or more performance characteristics of the subset of the one or more benchmark workloads in the cluster of processing nodes and the one or more performance characteristics of the subset of the one or more benchmark workloads in the known processing system.
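The steps recited in claim 1 can be illustrated with a minimal sketch. The record does not state how performance characteristics are represented, how similarity is measured, or how the comparative analysis is carried out, so everything below is an assumption made for illustration only: characteristics are taken to be numeric vectors (for example, fractions of time spent in the map, shuffle, and reduce phases), similarity is Euclidean distance, and the comparative analysis is the mean ratio of target-cluster to known-system run times over the selected benchmarks. The function names, benchmark names, and numbers are all hypothetical.

```python
from math import sqrt


def distance(a, b):
    """Euclidean distance between two equal-length characteristic vectors.

    Assumption: the claim only says "similar"; this metric is our choice.
    """
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_similar(job_profile, benchmark_profiles, k=2):
    """Return the names of the k benchmarks closest to the job's profile."""
    ranked = sorted(
        benchmark_profiles,
        key=lambda name: distance(job_profile, benchmark_profiles[name]),
    )
    return ranked[:k]


def estimate_completion_time(job_time_known, known_times, target_times, subset):
    """Scale the job's known-system time by the mean target/known ratio.

    Assumption: the "comparative analysis" is modeled as a simple average
    of per-benchmark run-time ratios; the record does not specify this.
    """
    ratios = [target_times[name] / known_times[name] for name in subset]
    return job_time_known * sum(ratios) / len(ratios)


# Hypothetical profiles measured on the known processing system.
job_profile = [0.6, 0.3, 0.1]
benchmark_profiles = {
    "sort": [0.5, 0.4, 0.1],
    "wordcount": [0.65, 0.25, 0.1],
    "grep": [0.1, 0.1, 0.8],
}

# Hypothetical benchmark run times (seconds) on each system.
known_times = {"sort": 100.0, "wordcount": 200.0, "grep": 50.0}
target_times = {"sort": 50.0, "wordcount": 150.0, "grep": 10.0}

subset = select_similar(job_profile, benchmark_profiles, k=2)
estimate = estimate_completion_time(400.0, known_times, target_times, subset)
print(subset, estimate)
```

In this sketch only the selected benchmarks need to run on the target cluster, which reflects the claim's premise that the cluster's hardware configuration is unknown and is characterized indirectly through benchmark behavior.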
Address: Norwalk CT US