Conference Paper Details
20th International Conference on Computing in High Energy and Nuclear Physics
HEPDOOP: High-Energy Physics Analysis using Hadoop
Physics; Computer Science
Bhimji, W.^1 ; Bristow, T.^1 ; Washbrook, A.^1
SUPA School of Physics and Astronomy, University of Edinburgh, Edinburgh, United Kingdom^1
Keywords: Analysis workflow; Binary files; Mass data processing; Multivariate analysis
Others: https://iopscience.iop.org/article/10.1088/1742-6596/513/2/022004/pdf
DOI: 10.1088/1742-6596/513/2/022004
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】
We perform an LHC data analysis workflow using tools and data formats that are commonly used in the "Big Data" community outside High Energy Physics (HEP). These include Apache Avro for serialisation to binary files, Pig and Hadoop for mass data processing, and Python Scikit-Learn for multivariate analysis. A comparison is made with the same analysis performed with current HEP tools in ROOT.
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
HEPDOOP: High-Energy Physics Analysis using Hadoop | 993KB | | download |