Conference Paper Details
17th International Workshop on Advanced Computing and Analysis Techniques in Physics Research
Data Locality via Coordinated Caching for Distributed Processing
Physics; Computer Science
Fischer, M.^1 ; Kuehn, E.^1 ; Giffels, M.^1 ; Jung, C.^1
Karlsruhe Institute of Technology, Steinbuch Centre for Computing, Hermann-von-Helmholtz-Platz 1, Eggenstein-Leopoldshafen 76344, Germany^1
Keywords: Batch systems; Data locality; Distributed processing; Large networks; Network bandwidth; Performance improvements; Required functionalities; Storage spaces
Others  :  https://iopscience.iop.org/article/10.1088/1742-6596/762/1/012011/pdf
DOI  :  10.1088/1742-6596/762/1/012011
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】

To enable data locality, we have developed an approach of adding coordinated caches to existing compute clusters. Since the data stored locally is volatile and selected dynamically, only a fraction of local storage space is required. Our approach allows the degree of data locality to be chosen freely. It may be used in conjunction with large network bandwidths, providing only highly used data to reduce peak loads. Alternatively, local storage may be scaled up to perform data analysis even with low network bandwidth. To prove the applicability of our approach, we have developed a prototype implementing all required functionality. It integrates seamlessly into batch systems, requiring practically no adjustments by users. We have been actively using this prototype on a test cluster for HEP analyses. Specifically, it has been integral to our jet energy calibration analyses for CMS during Run 2. The system has proven easy to use while providing substantial performance improvements. Since confirming the applicability for our use case, we have investigated the design in a more general way. Simulations show that many infrastructure setups can benefit from our approach. For example, it may enable us to dynamically provide data locality in opportunistic cloud resources. The experience we have gained from our prototype enables us to realistically assess the feasibility for general production use.
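
The idea of holding only highly used data within a limited local storage budget can be illustrated with a small Python sketch. This is not the authors' prototype: the function `select_cached_files`, the file names, the sizes, and the greedy most-accessed-first policy are all assumptions chosen for illustration. The `capacity` parameter stands in for the freely selectable degree of data locality.

```python
from collections import Counter


def select_cached_files(access_log, file_sizes, capacity):
    """Greedily pick the most frequently accessed files that fit in `capacity`.

    Hypothetical helper for illustration only; a real coordinated cache
    would also have to handle eviction, placement across nodes, and
    changing access patterns.
    """
    counts = Counter(access_log)
    selected, used = [], 0
    # Walk files from most to least accessed and keep those that still fit.
    for name, _ in counts.most_common():
        size = file_sizes[name]
        if used + size <= capacity:
            selected.append(name)
            used += size
    return selected


# Assumed example data: access history and file sizes in arbitrary units.
access_log = ["a.root", "b.root", "a.root", "c.root", "a.root", "b.root"]
file_sizes = {"a.root": 4, "b.root": 3, "c.root": 2}

# Small cache: only the most heavily used file is held locally.
small_cache = select_cached_files(access_log, file_sizes, capacity=4)
# Large cache: local storage is scaled up until everything fits.
large_cache = select_cached_files(access_log, file_sizes, capacity=9)
```

With a small `capacity` the cache only absorbs the most frequent accesses, which corresponds to reducing peak loads on a fast network. With a large `capacity` nearly all reads can be served locally, which corresponds to the low-bandwidth scenario.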
