Conference paper details
21st International Conference on Computing in High Energy and Nuclear Physics
Scalable and fail-safe deployment of the ATLAS Distributed Data Management system Rucio
Physics; Computer Science
Lassnig, M.^1 ; Vigne, R.^2 ; Beermann, T.^1 ; Barisits, M.^1 ; Garonne, V.^3 ; Serfon, C.^1
ATLAS Data Processing, Physics Department, CERN, Genève 1211-23, Switzerland^1
Institute for Astro- and Particle Physics, University of Innsbruck, Innsbruck 6020, Austria^2
Department of Physics, University of Oslo, Oslo 0316, Norway^3
Keywords: Administrative controls; Distributed data; Distributed data managements; Mitigation strategy; Monitoring strategy; Real time monitoring; Service migration; Software upgrades
Others: https://iopscience.iop.org/article/10.1088/1742-6596/664/6/062027/pdf
DOI: 10.1088/1742-6596/664/6/062027
Subject classification: Computer Science (general)
Source: IOP
【 Abstract 】

This contribution details the deployment of Rucio, the ATLAS Distributed Data Management system. The main complication is that Rucio interacts with a wide variety of external services, and connects globally distributed data centres under different technological and administrative control, at an unprecedented data volume. It is therefore not possible to create a duplicate instance of Rucio for testing or integration. Every software upgrade or configuration change is thus potentially disruptive and requires fail-safe software and automatic error recovery. Rucio uses a three-layer scaling and mitigation strategy based on quasi-realtime monitoring. This strategy mainly employs independent stateless services, automatic failover, and service migration. The technologies used for deployment and mitigation include OpenStack, Puppet, Graphite, HAProxy and Apache. In this contribution, the interplay between these components, their deployment, software mitigation, and the monitoring strategy are discussed.
