IEEE Access | Volume: 8
DSPBench: A Suite of Benchmark Applications for Distributed Data Stream Processing Systems
Gabriele Mencagli [1], Maycon Viana Bordin [2], Claudio F. R. Geyer [2], Dalvan Griebler [3], Luiz Gustavo L. Fernandes [3]
[1] Department of Computer Science, University of Pisa, Pisa, Italy
[2] Institute of Informatics, Federal University of Rio Grande do Sul (UFRGS), Porto Alegre, Brazil
[3] School of Technology, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Porto Alegre, Brazil
Keywords: Data stream processing; big data; benchmarking; Apache Storm; Spark Streaming
DOI: 10.1109/ACCESS.2020.3043948
Source: DOAJ
【 Abstract 】
Systems enabling the continuous processing of large data streams have recently attracted the attention of the scientific community and industrial stakeholders. Data Stream Processing Systems (DSPSs) are complex and powerful frameworks that ease the development of streaming applications in distributed computing environments such as clusters and clouds. Several systems of this kind, such as Apache Storm and Spark Streaming, have been released and are currently maintained as open source projects. The scientific community has often used benchmark applications to test and evaluate new techniques for improving the performance and usability of DSPSs. However, the existing benchmark suites lack representative workloads from the wide set of application domains that can leverage the near real-time performance offered by the stream processing paradigm. The goal of this article is to present a new benchmark suite composed of 15 applications from areas such as Finance, Telecommunications, Sensor Networks, Social Networks, and others. This article describes in detail the nature of these applications and their full workload characterization in terms of selectivity, processing cost, input size, and overall memory occupation. In addition, it exemplifies the usefulness of the benchmark suite for comparing real DSPSs, using Apache Storm and Spark Streaming for this analysis.
【 License 】
Unknown