21st International Conference on Computing in High Energy and Nuclear Physics
A study of dynamic data placement for ATLAS distributed data management
Physics; Computer Science
Beermann, T.^1,3 ; Stewart, G.A.^2 ; Maettig, P.^3
CERN, Geneva, Switzerland^1
University of Glasgow, Glasgow, United Kingdom^2
University of Wuppertal, Wuppertal, Germany^3
Keywords: Average waiting time; Computing resources; Data distribution; Data-intensive systems; Distributed data management; Forecasting methods; Prediction methods; Redistribution algorithms
Others: https://iopscience.iop.org/article/10.1088/1742-6596/664/3/032002/pdf DOI: 10.1088/1742-6596/664/3/032002
Subject classification: Computer Science (General)
Source: IOP
[ Abstract ]
This contribution presents a study on the applicability and usefulness of dynamic data placement methods for data-intensive systems, such as ATLAS distributed data management (DDM). In this system the jobs are sent to the data, so a good distribution of data is important. Ways of forecasting workload patterns are examined; the forecasts are then used to redistribute data to achieve a better overall utilisation of computing resources and to reduce the waiting time for jobs before they can run on the grid. This method is based on a tracer infrastructure that monitors and stores historical data accesses and is used to create popularity reports. These reports provide detailed summaries of past data accesses, including information about the accessed files, the users involved and the sites. From this past data, near-term forecasts of future data popularity can be made. This study evaluates simple prediction methods as well as more complex methods such as neural networks. Based on the outcome of the predictions, a redistribution algorithm deletes unused replicas and adds new replicas for potentially popular datasets. Finally, a grid simulator is used to examine the effects of the redistribution. The simulator replays the workload on different data distributions while measuring job waiting time and site usage. The study examines how the average waiting time is affected by the amount of data that is moved, how it differs between the various forecasting methods and how it compares to the optimal data distribution.
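The forecast-then-redistribute loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which simple prediction methods were used, so exponential smoothing is assumed here as a stand-in, and the function names, thresholds, dataset names and site names are all hypothetical.

```python
def forecast_accesses(history, alpha=0.5):
    """Forecast next-period accesses per dataset by exponential smoothing.

    history maps a dataset name to its past access counts, oldest first.
    Exponential smoothing is an assumed example of a "simple prediction
    method"; the abstract does not name the methods it evaluates.
    """
    forecast = {}
    for dataset, counts in history.items():
        level = float(counts[0])
        for count in counts[1:]:
            level = alpha * count + (1 - alpha) * level
        forecast[dataset] = level
    return forecast


def redistribute(replicas, forecast, sites, cold_below=1.0, n_hot=1):
    """Return a new replica map: dataset name -> set of site names.

    Hypothetical policy: datasets forecast below cold_below keep a single
    replica, and each of the n_hot datasets with the highest forecast gains
    one replica at the site currently holding the fewest replicas.
    """
    new_replicas = {ds: set(locs) for ds, locs in replicas.items()}

    # Delete surplus replicas of datasets predicted to be unused.
    for ds, locs in list(new_replicas.items()):
        if forecast.get(ds, 0.0) < cold_below and len(locs) > 1:
            new_replicas[ds] = {sorted(locs)[0]}

    # Add one replica for each dataset predicted to be popular.
    hot = sorted(forecast, key=forecast.get, reverse=True)[:n_hot]
    for ds in hot:
        if ds not in new_replicas:
            continue  # no existing replica to copy from
        free_sites = [s for s in sites if s not in new_replicas[ds]]
        if not free_sites:
            continue
        load = {
            s: sum(s in locs for locs in new_replicas.values())
            for s in free_sites
        }
        # Ties are broken by site name so the result is deterministic.
        new_replicas[ds].add(min(free_sites, key=lambda s: (load[s], s)))
    return new_replicas


if __name__ == "__main__":
    history = {"data.A": [0, 2, 8, 16], "data.B": [5, 1, 0, 0]}
    replicas = {"data.A": {"SITE_1"}, "data.B": {"SITE_1", "SITE_2"}}
    predicted = forecast_accesses(history)
    print(predicted)
    print(redistribute(replicas, predicted, ["SITE_1", "SITE_2", "SITE_3"]))
```

In the study itself, the effect of such a redistribution is measured by replaying the workload in a grid simulator; that step is beyond the scope of this sketch.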