Conference Paper Details
17th International Workshop on Advanced Computing and Analysis Techniques in Physics Research
A scalable architecture for online anomaly detection of WLCG batch jobs
Physics; Computer Science
Kuehn, E.^1 ; Fischer, M.^1 ; Giffels, M.^1 ; Jung, C.^1 ; Petzold, A.^1
Karlsruhe Institute of Technology, Steinbuch Centre for Computing, Hermann-von-Helmholtz-Platz 1, Eggenstein-Leopoldshafen 76344, Germany^1
Keywords: Anomaly detection; Computational costs; Local information; Misconfigurations; Network communications; Online anomaly detection; Scalable architectures; Superpeer networks
Others  :  https://iopscience.iop.org/article/10.1088/1742-6596/762/1/012002/pdf
DOI  :  10.1088/1742-6596/762/1/012002
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】

For data centres it is increasingly important to monitor network usage and to learn from network usage patterns. In particular, configuration issues or misbehaving batch jobs that prevent smooth operation need to be detected as early as possible. At the GridKa data and computing centre we therefore operate a tool, BPNetMon, that monitors traffic data and characteristics of WLCG batch jobs and pilots locally on the individual worker nodes. On the one hand, local information alone is not sufficient to detect anomalies for several reasons, e.g. the underlying job distribution on a single worker node might change or there might be a local misconfiguration. On the other hand, a centralised anomaly detection approach does not scale with respect to network communication or computational cost. We therefore propose a scalable architecture based on concepts of a super-peer network.
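
The trade-off described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper and does not reflect BPNetMon's actual interfaces: the names `NodeSummary`, `summarise`, `SuperPeer` and the `tolerance` threshold are hypothetical, and the median comparison is a stand-in for whatever detection method the authors use. The idea shown is only that each worker node sends a compact summary to its super-peer, which compares the nodes in its group against each other instead of forwarding raw traffic data to a central service.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass(frozen=True)
class NodeSummary:
    """Compact per-node report (hypothetical format, not BPNetMon's)."""
    node: str
    jobs: int
    bytes_per_job: float


def summarise(node, job_traffic):
    """Reduce the per-job byte counts seen on one worker node to a summary."""
    per_job = mean(job_traffic) if job_traffic else 0.0
    return NodeSummary(node, len(job_traffic), per_job)


class SuperPeer:
    """Collects summaries from the worker nodes assigned to it."""

    def __init__(self, tolerance=3.0):
        self.tolerance = tolerance  # assumed threshold, chosen for illustration
        self.summaries = {}

    def receive(self, summary):
        # Keep only the latest summary per node.
        self.summaries[summary.node] = summary

    def suspicious_nodes(self):
        """Nodes whose traffic per job exceeds tolerance times the group median."""
        active = [s for s in self.summaries.values() if s.jobs > 0]
        if not active:
            return []
        reference = median(s.bytes_per_job for s in active)
        return sorted(s.node for s in active
                      if s.bytes_per_job > self.tolerance * reference)


# Made-up traffic figures for four worker nodes; wn03 stands out.
peer = SuperPeer()
observed = {"wn01": [100, 100], "wn02": [110, 110],
            "wn03": [1000, 1000], "wn04": [90, 100]}
for name, traffic in observed.items():
    peer.receive(summarise(name, traffic))
print(peer.suspicious_nodes())
```

In this sketch a single node cannot tell whether its own traffic is unusual, while the super-peer can, and it only ever handles one small record per node in its group.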
