Thesis Details
Hadoop MapReduce Performance Enhancement Using In-Node Combiners
Keywords: MapReduce; Hadoop; HDFS; Combiner; NoSQL; 621
College of Engineering, Department of Electrical and Computer Engineering
University: Seoul National University Graduate School
Others: http://s-space.snu.ac.kr/bitstream/10371/123168/1/000000026798.pdf
United States | English
Source: Seoul National University Open Repository
【 Abstract 】

An overwhelming amount of data is being generated by various applications and devices in real time. While advanced analysis of large datasets is in high demand, data sizes have surpassed the capabilities of conventional software and hardware. Data-intensive analytics should be processed in a tolerable elapsed time using commodity hardware. The Hadoop framework efficiently distributes large datasets over multiple commodity servers, and the MapReduce framework performs parallel computations. We discuss the I/O bottlenecks of the Hadoop MapReduce framework and propose methods for enhancing I/O performance in common MapReduce jobs. A proven approach is to cache input data to maximize the memory locality of all map tasks. We introduce an approach to optimize I/O in the shuffle phase: the in-node combining design, which extends the scope of the traditional combiner to the node level. The in-node combiner reduces the total number of emitted intermediate results and curtails network traffic between mappers and reducers.
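
To make the effect described in the abstract concrete, the following is a minimal Python sketch of a word-count job. It is not the thesis's implementation and does not use Hadoop; it only simulates how many intermediate records would reach the shuffle under three strategies. The function names, the `mode` values, and the toy `NODES` input are all assumptions introduced for illustration.

```python
from collections import Counter


def map_words(split):
    """Map step: emit one (word, 1) pair per word in an input split."""
    return [(word, 1) for word in split.split()]


def combine(pairs):
    """Sum the counts per key, as a word-count combiner would."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def emitted_records(nodes, mode):
    """Count intermediate records sent to the shuffle under one strategy.

    nodes: list of nodes, each a list of input splits (one split per map task).
    mode:  "none", "per_task" (traditional combiner), or "in_node".
    """
    emitted = 0
    for splits in nodes:
        task_outputs = [map_words(s) for s in splits]
        if mode == "none":
            # Every raw (word, 1) pair is shuffled.
            emitted += sum(len(out) for out in task_outputs)
        elif mode == "per_task":
            # Each map task combines only its own output.
            emitted += sum(len(combine(out)) for out in task_outputs)
        elif mode == "in_node":
            # All map tasks on the node share one combining step.
            merged = [pair for out in task_outputs for pair in out]
            emitted += len(combine(merged))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return emitted


# Hypothetical toy cluster: 2 nodes, 2 map tasks (splits) per node.
NODES = [
    ["a b a", "a b c"],
    ["b b c", "c c a"],
]
```

On this toy input, `emitted_records(NODES, "none")` gives 12 records, `"per_task"` gives 9, and `"in_node"` gives 6. The saving from node-level combining grows when many map tasks on the same node emit the same keys.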

【 Attachments 】
Hadoop MapReduce Performance Enhancement Using In-Node Combiners (PDF, 737 KB)