Journal Article Details
IEEE Access
Pipeline-Based Linear Scheduling of Big Data Streams in the Cloud
Manos Roumeliotis [1], Stavros Souravlas [1], Nicoleta Tantalaki [1], Stefanos Katsavounis [2]
[1] Department of Applied Informatics, University of Macedonia, Thessaloniki, Greece; [2] Department of Production and Management Engineering, Democritus University of Thrace, Xanthi, Greece
Keywords: Stream processing; scheduling; big data; pipelines; distributed systems
DOI: 10.1109/ACCESS.2020.3004612
Source: DOAJ
【 Abstract 】

There is a growing need to handle large volumes of continuously arriving data efficiently and in a timely manner. Streams of big data have led to the emergence of several Distributed Stream Processing Systems (DSPSs) that assign processing tasks to the available resources (dynamically or not) and route streaming data between them. Efficient scheduling of processing tasks can reduce application latency and eliminate network congestion. However, the built-in scheduling techniques of the available DSPSs are far from optimal. In this work, we extend our previous work, in which we proposed a linear scheme for the task allocation and scheduling problem. Our scheme takes advantage of pipelines to efficiently handle applications that require heavy (all-to-all) communication between tasks assigned to pairs of components. Here, we prove that our scheme is periodic, and we provide a communication refinement algorithm and a mechanism to handle many-to-one assignments efficiently. For concreteness, our work is illustrated using Apache Storm semantics. The performance evaluation shows that our algorithm achieves load balance and constrains the required buffer space. For throughput testing, we compared our work to the default Storm scheduler, as well as to R-Storm. Our scheme outperformed both strategies, achieving an average improvement of 25%-40% over Storm's default scheduler under different scenarios, mainly as a result of reduced buffering (≈ 45% less memory). Compared to R-Storm, the results indicate an average improvement of 35%-45%.

【 License 】

Unknown   
