Conference paper details
20th International Conference on Computing in High Energy and Nuclear Physics
ATLAS Job Transforms: A Data Driven Workflow Engine
Physics; Computer Science
Stewart, G.A.^1,2 ; Breaden-Madden, W.B.^2 ; Maddocks, H.J.^3 ; Harenberg, T.^4 ; Sandhoff, M.^4 ; Sarrazin, B.^5
CERN, Geneva 23, CH-1211, Switzerland^1
University of Glasgow, Glasgow, G12 8QQ, United Kingdom^2
University of Lancaster, Bailrigg, Lancashire, Lancaster, LA1 4YW, United Kingdom^3
Bergische Universität Wuppertal, Gaußstraße 20, Wuppertal, 42119, Germany^4
Universität Bonn, Bonn, D-53012, Germany^5
Keywords: Complex workflows; Computing resource; Execution costs; Execution paths; High energy physics experiments; Multi-Processes; Production system; Workflow engines
Others: https://iopscience.iop.org/article/10.1088/1742-6596/513/3/032094/pdf
DOI: 10.1088/1742-6596/513/3/032094
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】

The need to run complex workflows for a high energy physics experiment such as ATLAS has always been present. However, as computing resources have become even more constrained relative to the wealth of data generated by the LHC, the need to use resources efficiently and to manage complex workflows within a single grid job has increased. This paper describes a new Job Transform framework developed in ATLAS. The framework manages the multiple execution steps needed to 'transform' one data type into another (e.g., RAW data to ESD to AOD to final ntuple) and also provides a consistent interface for the ATLAS production system. It uses a data-driven workflow definition that is both easy to manage and powerful. After a transform is defined, jobs are expressed simply by specifying the input data and the desired output data. The transform infrastructure then executes only the substeps necessary to produce the final data products. The global execution cost of running the job is minimised, and the transform can adapt to scenarios where data can be produced along different execution paths. Transforms for specific physics tasks that support up to 60 individual substeps have been run successfully. As the new transforms infrastructure has been deployed in production, many features have been added to the framework that improve reliability and the quality of error reporting and that provide support for multi-process jobs.
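The idea of expressing a job only by its input and desired output data, and letting the framework pick the cheapest chain of substeps, can be illustrated with a small sketch. This is not the ATLAS implementation: the `Substep` type, the `plan_substeps` function, the substep names and the cost values are all hypothetical, and the sketch assumes each substep consumes exactly one data type and produces exactly one. Under those assumptions, a shortest-path search over data types is one way such a selection could be made.

```python
import heapq
from typing import NamedTuple


class Substep(NamedTuple):
    name: str     # hypothetical substep label
    source: str   # data type consumed
    target: str   # data type produced
    cost: float   # illustrative relative execution cost


def plan_substeps(substeps, input_type, output_type):
    """Return the cheapest list of substep names from input_type to output_type."""
    # Queue entries: (accumulated cost, tie-breaker, data type, path so far).
    queue = [(0.0, 0, input_type, [])]
    settled = set()
    counter = 1
    while queue:
        cost, _, data_type, path = heapq.heappop(queue)
        if data_type == output_type:
            return path
        if data_type in settled:
            continue  # already reached this data type at lower or equal cost
        settled.add(data_type)
        for step in substeps:
            if step.source == data_type:
                entry = (cost + step.cost, counter, step.target, path + [step.name])
                heapq.heappush(queue, entry)
                counter += 1
    raise ValueError(f"no execution path from {input_type} to {output_type}")


# Made-up substeps and costs; NTUP can be reached from ESD along two paths.
STEPS = [
    Substep("RAWtoESD", "RAW", "ESD", 10.0),
    Substep("ESDtoAOD", "ESD", "AOD", 3.0),
    Substep("AODtoNTUP", "AOD", "NTUP", 1.0),
    Substep("ESDtoNTUP", "ESD", "NTUP", 6.0),
]

print(plan_substeps(STEPS, "RAW", "NTUP"))
print(plan_substeps(STEPS, "ESD", "AOD"))
```

With these made-up costs, a job starting from ESD and asking only for AOD runs a single substep, while a job asking for NTUP goes through AOD because that path is cheaper than the direct one. A real transform would also need to handle substeps with several inputs or outputs, which this sketch deliberately leaves out.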
