学位论文详细信息
Stela: on-demand elasticity in distributed data stream processing systems
Distributed Systems;Stream Processing Systems
Xu, Le ; Gupta ; Indranil
关键词: Distributed Systems;    Stream Processing Systems;   
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/88083/XU-THESIS-2015.pdf?sequence=1&isAllowed=y
美国|英语
来源: The Illinois Digital Environment for Access to Learning and Scholarship
PDF
【 摘 要 】

Big data is characterized by volume and velocity [24], and recently several real- time stream processing systems have emerged to combat this challenge. These systems process streams of data in real time and computational results. However, current popular data stream processing systems lack the ability to scale out and scale in (i.e., increase or decrease the number of machines or VMs allocated to the application) efficiently and unintrusively when requested by the user on demand. In order to scale out/in, a critical problem that needs to be solved is to determine which operator(s) of the stream processing application need to be given more resources or taken resources away from, in order to maximize the application throughput. We do so by presenting a novel metric called "Expected Throughput Percentage" (ETP). ETP takes into account not only congested elements of the stream processing application but also their effect on downstream elements and on the overall application throughput.Next, we show how our new system, called Stela (STream processing ELAsticity), incorporates ETP in its scheduling strategy. Stela enables scale out and scale in operations on demand, and achieves the twin goals of optimizing post-scaling throughput and minimizing interference to throughput during the scaling out/in. We have integrated the implementation of Stela into Apache Storm [27], a popular data stream processing system.We conducted experiments on Stela using a set of micro benchmark topologies as well as two topologies from Yahoo! Inc. Our experiment results shows Stela achieves 5% to 120% higher post scale throughput comparing to default Storm scheduler performing scale out operations, and 40% to 500% of throughput improvement comparing to the default scheduler during scale in stage. This work is a joint project with Master student Boyang Peng [1].

【 预 览 】
附件列表
Files Size Format View
Stela: on-demand elasticity in distributed data stream processing systems 3225KB PDF download
  文献评价指标  
  下载次数:2次 浏览次数:3次