21st International Conference on Computing in High Energy and Nuclear Physics
LHCb experience with running jobs in virtual machines
Physics; Computer Science
McNab, A.^1 ; Stagni, F.^2 ; Luzzi, C.^2
School of Physics and Astronomy, University of Manchester, United Kingdom^1
CERN, Switzerland^2
Keywords: Contextualisation; Distributed file-system; Operational experience; Production jobs; Production manager; Software stacks; Virtual-machine managers; Worker nodes
Others: https://iopscience.iop.org/article/10.1088/1742-6596/664/2/022030/pdf DOI: 10.1088/1742-6596/664/2/022030
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】
The LHCb experiment has been running production jobs in virtual machines since 2013 as part of its DIRAC-based infrastructure. We describe the architecture of these virtual machines and the steps taken to replicate the WLCG worker node environment expected by user and production jobs. This relies on the uCernVM system to provide root images for the virtual machines. We use the CernVM-FS distributed filesystem to supply the root partition files, the LHCb software stack, and the bootstrapping scripts needed to configure the virtual machines for us. With this approach, we have been able to minimise the amount of contextualisation that the virtual machine managers must provide. We explain how a virtual machine receives payload jobs submitted to DIRAC by users and production managers, and how this differs from payloads executed within conventional DIRAC pilot jobs at batch-queue-based sites. We describe our operational experience of running production at VM-based sites managed using Vcycle/OpenStack, Vac, and HTCondor Vacuum. Finally, we show how our use of these resources is monitored using Ganglia and DIRAC.
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
LHCb experience with running jobs in virtual machines | 864KB | PDF | download |