Technical Report Details
SLO-Driven Right-Sizing and Resource Provisioning of MapReduce Jobs
Verma, Abhishek ; Cherkasova, Ludmila ; Campbell, Roy H.
HP Development Company
Keywords: MapReduce; Hadoop; performance models; completion time prediction; resource allocation
RP-ID: HPL-2011-126
Subject classification: Computer Science (General)
United States | English
Source: HP Labs
【 Abstract 】

A growing number of MapReduce applications (e.g., personalized advertising, spam detection, and real-time event log analysis) must be completed within a given time window. System administrators currently lack performance models and workload analysis tools for automated performance management of such MapReduce jobs. In this work, we outline a novel framework for SLO-driven resource provisioning and sizing of MapReduce jobs. First, we propose an automated profiling tool that extracts a compact job profile from past application runs or by executing the application on a smaller data set. Then, by applying a linear regression technique, we derive scaling factors to accurately project the application performance when processing a larger data set. The job profile (with scaling factors) forms the basis of a MapReduce performance model that computes lower and upper bounds on the job completion time. Finally, we provide a fast and efficient capacity planning model that, for a MapReduce job with timing requirements, generates a set of resource provisioning options. We validate the accuracy of our models by executing a set of realistic applications on a 66-node Hadoop cluster.
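
The abstract does not give the model's formulas, so the following is only a minimal sketch of the three steps it names (regression-based scaling, completion time bounds, provisioning options), not the report's actual implementation. It assumes the classic makespan bounds for greedily assigning n tasks to k slots, namely a lower bound of n·avg/k and an upper bound of (n−1)·avg/k + max, and it treats a job as a map stage followed by a reduce stage with no separate shuffle phase. All names (`StageProfile`, `fit_scaling`, `stage_bounds`, `provisioning_options`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StageProfile:
    """Hypothetical compact profile of one stage: task durations in seconds."""
    avg: float  # average task duration
    max: float  # maximum task duration


def fit_scaling(sizes, durations):
    """Least-squares fit of duration ~ c0 + c1 * size (simple linear regression)."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(durations) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, durations))
    c1 = sxy / sxx
    c0 = mean_y - c1 * mean_x
    return c0, c1


def stage_bounds(n_tasks, slots, profile):
    """Assumed greedy-assignment makespan bounds for n_tasks on `slots` slots."""
    lower = n_tasks * profile.avg / slots
    upper = (n_tasks - 1) * profile.avg / slots + profile.max
    return lower, upper


def provisioning_options(n_map, n_reduce, map_p, red_p, deadline, max_slots):
    """For each map-slot count, find the fewest reduce slots whose upper bound
    on total time (map upper + reduce upper, a simplification) meets the deadline."""
    options = []
    for m in range(1, max_slots + 1):
        map_upper = stage_bounds(n_map, m, map_p)[1]
        for r in range(1, max_slots + 1):
            if map_upper + stage_bounds(n_reduce, r, red_p)[1] <= deadline:
                options.append((m, r))
                break  # smallest r for this m; larger r only wastes slots
    return options
```

Using the upper bound in `provisioning_options` is a conservative choice: any returned (map slots, reduce slots) pair meets the deadline under the assumed model even in the worst case, at the cost of possibly over-provisioning.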
