21st International Conference on Computing in High Energy and Nuclear Physics | |
Archiving Scientific Data Outside of the Traditional HEP Domain, Using the Archive Facilities at Fermilab | |
Physics; Computer Science | |
Norman, A.^1 ; Diesburg, M.^1 ; Gheith, M.^1 ; Illingworth, R.^1 ; Mengel, M.^1 | |
Fermi National Accelerator Laboratory, Batavia, IL, United States^1 | |
Keywords: Archival storages; Dark matter detectors; Data acquisition system; Data handling systems; Distributed computing resources; Open science grid; River laboratories; Sudbury neutrino observatories; | |
Others : https://iopscience.iop.org/article/10.1088/1742-6596/664/4/042039/pdf DOI : 10.1088/1742-6596/664/4/042039 |
Subject classification: Computer Science (General) | |
Source: IOP | |
【 Abstract 】
Many experiments in the HEP and Astrophysics communities generate large, extremely valuable datasets, which need to be efficiently cataloged and recorded to archival storage. These datasets, both new and legacy, are often structured in a manner that is not conducive to storage and cataloging with modern data handling systems and large file archive facilities. In this paper we discuss in detail how we have created a robust toolset and simple portal into the Fermilab archive facilities, which allows scientific data to be quickly imported, organized and retrieved from the multi-petabyte facility. In particular we discuss how the data from the Sudbury Neutrino Observatory (SNO) for the COUPP dark matter detector was aggregated, cataloged, archived and re-organized to permit it to be retrieved and analyzed using modern distributed computing resources both at Fermilab and on the Open Science Grid. We pay particular attention to the methods that were employed to uniquify the namespaces for the data and to derive metadata for the more than 460,000 image series taken by the COUPP experiment, and to what was required to map that information into coherent datasets that could be stored and retrieved using the large-scale archive systems. We describe the data transfer and cataloging engines that are used for data importation and how these engines have been set up to import data from the data acquisition systems of ongoing experiments at non-Fermilab remote sites, including the Laboratori Nazionali del Gran Sasso and the Ash River Laboratory in Orr, Minnesota. We also describe how large university computing sites around the world are using the system to store and retrieve large volumes of simulation and experiment data for physics analysis.
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
Archiving Scientific Data Outside of the Traditional HEP Domain, Using the Archive Facilities at Fermilab | 1765KB | download |