Conference paper details
20th International Conference on Computing in High Energy and Nuclear Physics
Transaction aware tape-infrastructure monitoring
Physics; Computer science
Nikolaidis, Fotios ; Kruse, Daniele Francesco
Keywords: Access control lists; Centralized systems; Generic interfaces; Information visualization; Infrastructure monitoring; Monitoring system; Performance evaluations; Reliability requirements
Others  :  https://iopscience.iop.org/article/10.1088/1742-6596/513/3/032070/pdf
DOI  :  10.1088/1742-6596/513/3/032070
Subject classification: Computer science (general)
Source: IOP
【 Abstract 】

Administering a large-scale, multi-protocol, hierarchical tape infrastructure like the CERN Advanced STORage manager (CASTOR)[2], which now stores 100 PB (growing by 25 PB per year), requires an adequate monitoring system for quickly spotting malfunctions, easier debugging and on-demand report generation. The main challenges for such a system are coping with CASTOR's diverse log formats and with information scattered across several log files, the need for long-term information archival, strict reliability requirements and group-based GUI visualization. For this purpose, we have designed, developed and deployed a centralized system consisting of four independent layers: the Log Transfer layer, which collects log lines from all tape servers on a single aggregation server; the Data Mining layer, which combines log data into transaction context; the Storage layer, which archives the resulting transactions; and the Web UI layer, which provides access to the information. With flexibility, extensibility and maintainability in mind, each layer is designed to work as a message broker for the next layer, providing a clean and generic interface while ensuring consistency, redundancy and ultimately fault tolerance. Using Splunk, the system unifies information previously dispersed over several monitoring tools into a single user interface; Splunk also allows us to provide information visualization based on access control lists (ACLs). Since its deployment, the system has been successfully used by CASTOR tape operators for a quick overview of transactions, performance evaluation and malfunction detection, and by managers for report generation.
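
The Data Mining step described in the abstract, which combines log lines scattered across several servers into one transaction context, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the log line layout (`timestamp host key=value ...`), the `tid` and `event` field names, the host names and the sample lines are all hypothetical, chosen only to show the idea of grouping lines by a shared transaction identifier.

```python
from collections import defaultdict


def parse_line(line):
    """Parse a hypothetical 'timestamp host key=value ...' log line into a dict."""
    timestamp, host, *pairs = line.split()
    # Tokens without '=' are ignored; real CASTOR logs will differ.
    fields = dict(pair.split("=", 1) for pair in pairs if "=" in pair)
    fields["timestamp"] = timestamp
    fields["host"] = host
    return fields


def build_transactions(lines):
    """Group log lines by the assumed 'tid' field and summarize each group."""
    grouped = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if "tid" not in record:
            continue  # a line without a transaction id cannot be correlated
        grouped[record["tid"]].append(record)

    transactions = {}
    for tid, records in grouped.items():
        # ISO 8601 timestamps in the same format sort correctly as strings.
        records.sort(key=lambda r: r["timestamp"])
        transactions[tid] = {
            "start": records[0]["timestamp"],
            "end": records[-1]["timestamp"],
            "hosts": sorted({r["host"] for r in records}),
            "events": [r.get("event", "unknown") for r in records],
        }
    return transactions


# Hypothetical lines as they might arrive, out of order, from two tape servers.
SAMPLE_LINES = [
    "2013-10-14T10:00:05 tpsrv002 tid=42 event=mount",
    "2013-10-14T10:00:01 tpsrv001 tid=42 event=request",
    "2013-10-14T10:00:03 tpsrv001 tid=43 event=request",
    "2013-10-14T10:00:09 tpsrv002 tid=42 event=unmount",
]

if __name__ == "__main__":
    for tid, summary in sorted(build_transactions(SAMPLE_LINES).items()):
        print(tid, summary)
```

Grouping by an identifier and sorting by timestamp is only one way to rebuild a transaction; a production layer would also need to handle lines that arrive late or never arrive, which this sketch does not attempt.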
