21st International Conference on Computing in High Energy and Nuclear Physics | |
Efficient provisioning for multi-core applications with LSF | |
Physics; Computer Science | |
Dal Pra, Stefano^1 | |
INFN-CNAF, viale Berti-Pichat 6/2, Bologna | |
40127, Italy^1 | |
Keywords: Average numbers; Batch systems; Computing power; HEP experiments; High throughput; Overall efficiency; Resource provisioning; Task force | |
Others : https://iopscience.iop.org/article/10.1088/1742-6596/664/5/052008/pdf DOI : 10.1088/1742-6596/664/5/052008 | |
Subject classification: Computer Science (General) | |
Source: IOP | |
【 Abstract 】
Tier-1 sites providing computing power for HEP experiments are usually tightly designed for high-throughput performance. This is pursued by reducing the variety of supported use cases and tuning the remaining ones for performance, the most important of which has been single-core jobs. Moreover, the usual workload is saturation: every available core in the farm is in use, and queued jobs are waiting for their turn to run. Enabling multi-core jobs thus requires dedicating a number of hosts to them and waiting for those hosts to free the needed number of cores. This drain time introduces a loss of computing power, driven by the number of unusable empty cores. As an increasing demand for multi-core capable resources has emerged, a Task Force has been constituted in WLCG with the goal of defining a simple and efficient multi-core resource provisioning model. This paper details the work done at the INFN Tier-1 to enable multi-core support for the LSF batch system, with the intent of minimizing the average number of unused cores. The adopted strategy is to dedicate a dynamic set of nodes to multi-core jobs; the size of this set is mainly driven by the number of pending multi-core requests and the fair-share priority of the submitting user. Node status transitions, from single-core to multi-core and vice versa, are driven by a finite state machine implemented in a custom multi-core director script running in the cluster. After describing and motivating both the implementation and the details specific to the LSF batch system, results on performance are reported. Factors with a positive or negative impact on overall efficiency are discussed, and solutions to minimize the negative ones are proposed.
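To make the node-transition idea in the abstract concrete, the following is a minimal sketch of how a per-node finite state machine might look. It is not the director script described in the paper: the state names (`SINGLE_CORE`, `DRAINING`, `MULTI_CORE`), the transition rules, and the function and parameter names are all assumptions introduced here for illustration, and the fair-share priority mentioned in the abstract is deliberately left out. The helper `idle_core_hours` illustrates, under the same caveat, how the drain-time loss could be estimated from periodic samples of idle cores.

```python
from enum import Enum


class NodeState(Enum):
    # Hypothetical states; the paper's actual state set is not given in the abstract.
    SINGLE_CORE = "single_core"  # node accepts single-core jobs only
    DRAINING = "draining"        # no new single-core jobs; waiting for cores to free up
    MULTI_CORE = "multi_core"    # node accepts multi-core jobs


def next_state(state, free_cores, cores_needed, pending_multicore):
    """Return the next state of one node under assumed transition rules."""
    if state is NodeState.SINGLE_CORE:
        # Start draining only when multi-core requests are waiting.
        return NodeState.DRAINING if pending_multicore > 0 else state
    if state is NodeState.DRAINING:
        if pending_multicore == 0:
            # Demand vanished: give the node back to single-core jobs.
            return NodeState.SINGLE_CORE
        # Enough free cores: the node can host a multi-core job.
        return NodeState.MULTI_CORE if free_cores >= cores_needed else state
    # MULTI_CORE: return the node when no multi-core demand remains.
    return NodeState.SINGLE_CORE if pending_multicore == 0 else state


def idle_core_hours(free_cores_per_sample, sample_interval_hours):
    """Estimate core-hours lost while draining from equally spaced samples."""
    return sum(free_cores_per_sample) * sample_interval_hours
```

For example, a draining node sampled every half hour with 1, 3, 6, and 8 idle cores would account for `idle_core_hours([1, 3, 6, 8], 0.5)` core-hours of unused capacity, which is the quantity the abstract describes as the cost of drain time.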