Conference paper details
16th International Workshop on Advanced Computing and Analysis Techniques in Physics Research
Planning for distributed workflows: constraint-based coscheduling of computational jobs and data placement in distributed environments
Physics; Computer Science
Makatun, Dzmitry^1,3 ; Lauret, Jérôme^2 ; Rudová, Hana^4 ; Šumbera, Michal^3
Faculty of Nuclear Physics and Physical Engineering, Czech Technical University in Prague, Czech Republic^1
STAR, Brookhaven National Laboratory, United States^2
Nuclear Physics Institute, Academy of Sciences, Czech Republic^3
Masaryk University, Czech Republic^4
Keywords: Computation performance; Constraint programming; Data management system; Data-intensive application; Distributed computational resources; Distributed environments; Performance improvements; Resource utilizations
Others: https://iopscience.iop.org/article/10.1088/1742-6596/608/1/012028A/pdf
DOI: 10.1088/1742-6596/608/1/012028A
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】

When data-intensive applications run on distributed computational resources, long I/O overheads may be observed as remotely stored data is accessed. Latency and bandwidth can become the major limiting factors for overall computation performance and can reduce the CPU/WallTime ratio through excessive I/O wait. Building on our previous research, we propose a constraint-programming-based planner that schedules computational jobs and data placements (transfers) in a distributed environment in order to optimize resource utilization and reduce the overall processing completion time. The optimization is achieved by ensuring that none of the resources (network links, data storage and CPUs) is oversaturated at any moment and that either (a) the data is pre-placed at the site where the job runs or (b) the jobs are scheduled where the data is already present. Such an approach eliminates the idle CPU cycles that occur when a job waits for I/O from a remote site and would have wide application in the community. Our planner was evaluated in simulations based on data extracted from log files of the batch and data management systems of the STAR experiment. The results of the evaluation and the estimated performance improvements are discussed in this paper.
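
The coscheduling idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' planner and does not use a constraint programming solver: it simply enumerates every job-to-site assignment and keeps the one with the shortest completion time. The sites, link transfer times, job list, and the function names `makespan`, `transfers` and `best_plan` are all hypothetical and chosen only for illustration. Oversaturation is avoided in the simplest possible way, by allowing one transfer per link and one job per CPU slot at a time, and a job may start only once its input is at the site where it runs.

```python
from itertools import product

# Hypothetical toy inputs (not taken from the paper or from STAR logs).
SITES = {"A": 1, "B": 1}                    # site -> number of CPU slots
LINK_TIME = {("A", "B"): 4, ("B", "A"): 4}  # time to move one input file over a link
JOBS = [                                    # (job name, site holding its input, run time)
    ("j1", "A", 5),
    ("j2", "A", 5),
    ("j3", "B", 2),
]


def transfers(assignment):
    """Number of jobs whose input must be moved to another site."""
    return sum(site != src for (_, src, _), site in zip(JOBS, assignment))


def makespan(assignment):
    """Completion time of a plan that runs job i at site assignment[i]."""
    link_free = {link: 0 for link in LINK_TIME}        # links carry one transfer at a time
    cpu_free = {s: [0] * n for s, n in SITES.items()}  # next free time of each CPU slot
    ready = []
    for (_, src, run), site in zip(JOBS, assignment):
        if site == src:
            data_ready = 0                             # (b) job goes where the data is
        else:
            link = (src, site)                         # (a) data is pre-placed first
            data_ready = link_free[link] + LINK_TIME[link]
            link_free[link] = data_ready
        ready.append((data_ready, run, site))
    end = 0
    # Greedy heuristic: start jobs in the order in which their data becomes available.
    for data_ready, run, site in sorted(ready):
        slots = cpu_free[site]
        i = min(range(len(slots)), key=slots.__getitem__)
        start = max(slots[i], data_ready)              # a CPU never waits on remote I/O mid-job
        slots[i] = start + run
        end = max(end, slots[i])
    return end


def best_plan():
    """Exhaustive search; ties are broken in favour of fewer transfers."""
    candidates = product(SITES, repeat=len(JOBS))
    return min(candidates, key=lambda a: (makespan(a), transfers(a)))


if __name__ == "__main__":
    plan = best_plan()
    print(plan, makespan(plan))
```

In this toy instance, running every job where its data already resides takes 10 time units because site A has a single CPU slot, whereas moving one of A's inputs to site B while the CPUs are busy with other jobs shortens the plan to 9. Exhaustive enumeration grows exponentially with the number of jobs, which is one reason a dedicated solver would be needed for realistic workloads.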
