Journal Article Details
Computer Science and Information Systems
Throughput prediction based on ExtraTree for stream processing tasks
article
Zheng Chu [1], Jiong Yu [1], Askar Hamdulla [1]
[1] School of Information Science and Engineering, Xinjiang University
Keywords: streaming data; stream processing tasks; performance prediction; ensemble learning; ExtraTree
DOI: 10.2298/CSIS200131031C
Subject classification: Civil and Structural Engineering
Source: Computer Science and Information Systems
【 Abstract 】

In the era of big data, as the amount of streaming data continues to increase, stream processing tasks (SPTs) face serious challenges in real-time processing scenarios that demand low latency and high throughput. However, much of the current literature on the performance of SPTs focuses on reactive approaches, which cannot reliably prevent system crashes caused by inherent performance volatility. In this paper, a novel throughput prediction method based on ExtraTree is presented for SPTs to address these challenges. First, the performance volatility of SPTs is studied, and a volatility detection algorithm is proposed to obtain reasonable metric values. Second, a regression-function selection algorithm is proposed to output the performance values of SPTs in a relatively steady state. Third, an ExtraTree-based algorithm is proposed to predict the throughput of SPTs. Experimental results from two open-source benchmarks running on Apache Flink, a popular stream processing system (SPS), indicate that the proposed method achieves an average accuracy of 90.535% and an average efficiency of 0.835 s per 10,000 samples, which demonstrates its effectiveness in predicting the throughput of SPTs.
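To make the ExtraTree idea concrete, the following is a minimal sketch of an extremely randomized trees regressor in plain Python. It is not the authors' implementation. The abstract does not state which features or hyperparameters were used, so the two features here (operator parallelism and input rate), the synthetic throughput formula, and all function names are assumptions chosen purely for illustration. The sketch follows the general Extra-Trees recipe: each split draws one random cut-point per feature and keeps the candidate with the lowest squared error, and every tree is grown on the full sample without bootstrapping.

```python
import random
from statistics import fmean


def sse(ys):
    """Sum of squared errors of ys around their mean."""
    m = fmean(ys)
    return sum((y - m) ** 2 for y in ys)


def build_tree(rows, rng, min_split=2):
    """Grow one extremely randomized tree from (features, target) rows."""
    ys = [y for _, y in rows]
    if len(rows) < min_split or min(ys) == max(ys):
        return fmean(ys)  # leaf: mean target of the samples that reach it
    best = None
    for f in range(len(rows[0][0])):
        vals = [x[f] for x, _ in rows]
        lo, hi = min(vals), max(vals)
        if lo == hi:
            continue  # constant feature, nothing to split on
        cut = lo + (hi - lo) * rng.random()  # random cut-point in [lo, hi)
        left = [r for r in rows if r[0][f] <= cut]
        right = [r for r in rows if r[0][f] > cut]
        if not left or not right:
            continue
        score = sse([y for _, y in left]) + sse([y for _, y in right])
        if best is None or score < best[0]:
            best = (score, f, cut, left, right)
    if best is None:
        return fmean(ys)
    _, f, cut, left, right = best
    return (f, cut,
            build_tree(left, rng, min_split),
            build_tree(right, rng, min_split))


def predict_tree(node, x):
    """Walk from the root to a leaf; internal nodes are tuples, leaves floats."""
    while isinstance(node, tuple):
        f, cut, left, right = node
        node = left if x[f] <= cut else right
    return node


def fit_extra_trees(rows, n_trees=50, seed=0):
    """Each tree sees the whole sample (no bootstrap), as in Extra-Trees."""
    rng = random.Random(seed)
    return [build_tree(rows, rng) for _ in range(n_trees)]


def predict(forest, x):
    """Ensemble prediction: average of the individual tree outputs."""
    return fmean([predict_tree(tree, x) for tree in forest])


# Synthetic, hypothetical data: throughput saturates at 5000 records/s per slot.
rows = [((p, r), float(min(r, 5000 * p)))
        for p in range(1, 9)
        for r in range(2000, 42000, 4000)]
forest = fit_extra_trees(rows)
estimate = predict(forest, (3.5, 12000))  # unseen configuration
```

In practice a library implementation such as scikit-learn's `ExtraTreesRegressor` would normally be preferred; the hand-written version is shown only so that the randomized split selection is visible.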

【 License 】

CC BY-NC-ND   
