Conference paper details
21st International Conference on Computing in High Energy and Nuclear Physics
A study of dynamic data placement for ATLAS distributed data management
Physics; Computer Science
Beermann, T.^1,3 ; Stewart, G.A.^2 ; Maettig, P.^3
CERN, Geneva, Switzerland^1
University of Glasgow, Glasgow, United Kingdom^2
University of Wuppertal, Wuppertal, Germany^3
Keywords: Average waiting-time; Computing resource; Data distribution; Data-intensive systems; Distributed data managements; Forecasting methods; Prediction methods; Redistribution algorithms
Others  :  https://iopscience.iop.org/article/10.1088/1742-6596/664/3/032002/pdf
DOI  :  10.1088/1742-6596/664/3/032002
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】

This contribution presents a study of the applicability and usefulness of dynamic data placement methods for data-intensive systems such as ATLAS distributed data management (DDM). In this system jobs are sent to the data, so a good distribution of data is important. The study examines ways of forecasting workload patterns, which are then used to redistribute data in order to achieve better overall utilisation of computing resources and to reduce the time jobs wait before they can run on the grid. The method is based on a tracer infrastructure that monitors and stores historical data accesses and is used to create popularity reports. These reports provide detailed summaries of past data accesses, including information about the accessed files, the users involved and the sites. From this past data it is possible to make near-term forecasts of future data popularity. The study evaluates simple prediction methods as well as more complex ones such as neural networks. Based on the outcome of the predictions, a redistribution algorithm deletes unused replicas and adds new replicas for potentially popular datasets. Finally, a grid simulator is used to examine the effects of the redistribution. The simulator replays workload on different data distributions while measuring job waiting time and site usage. The study examines how the average waiting time is affected by the amount of data that is moved, how it differs between the forecasting methods and how it compares with the optimal data distribution.
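
The forecast-then-redistribute loop described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the predictors or the redistribution rules, so the moving-average forecast, the `Dataset` class, the `redistribute` function, its thresholds and the site names below are all hypothetical choices made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    weekly_accesses: list[int]  # access counts per week, oldest first (made-up data)
    replicas: set[str] = field(default_factory=set)  # sites holding a copy


def forecast_next_week(history: list[int], window: int = 3) -> float:
    """Mean of the last `window` weeks; a stand-in for a 'simple' predictor."""
    recent = history[-window:]
    return sum(recent) / len(recent) if recent else 0.0


def redistribute(datasets, sites, max_new_replicas=1, min_replicas=1):
    """Hypothetical rule: drop one replica of each dataset forecast to be
    unused, then add one replica for the most popular datasets."""
    forecasts = {d.name: forecast_next_week(d.weekly_accesses) for d in datasets}

    # Deletion step: never go below `min_replicas` copies.
    for d in datasets:
        if forecasts[d.name] == 0 and len(d.replicas) > min_replicas:
            d.replicas.remove(sorted(d.replicas)[-1])

    # Addition step: highest forecast first, placed on a site without a copy.
    ranked = sorted(datasets, key=lambda d: forecasts[d.name], reverse=True)
    for d in ranked[:max_new_replicas]:
        free_sites = sorted(set(sites) - d.replicas)
        if forecasts[d.name] > 0 and free_sites:
            d.replicas.add(free_sites[0])

    return forecasts


# Example with made-up datasets and site names.
sites = ["SITE_A", "SITE_B", "SITE_C"]
hot = Dataset("hot", [2, 8, 14], {"SITE_A"})
cold = Dataset("cold", [5, 0, 0, 0], {"SITE_A", "SITE_B"})
forecasts = redistribute([hot, cold], sites)
print(forecasts, hot.replicas, cold.replicas)
```

In this toy run the dataset with no recent accesses loses one replica and the most accessed dataset gains one. A study such as the one described would then replay a workload against the new distribution to measure waiting time, which this sketch does not attempt.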
