Technical Report

[Abstract]

In the race to PetaFLOP-speed supercomputing systems, the increase in computational capability has been accompanied by corresponding increases in CPU count, total RAM, and storage capacity. However, storage bandwidth has not increased proportionally. To improve system reliability and reduce maintenance effort for modern large-scale systems, system designers have opted to remove node-local storage from the compute nodes. Today's multi-TeraFLOP supercomputers are typically attached to parallel file systems that provide only tens of GB/s of I/O bandwidth. As a result, such machines have access to much less than 1 GB/s of I/O bandwidth per TeraFLOP of compute power, which is below the generally accepted limit required for a well-balanced system. In many ways, the current I/O bottleneck limits the capabilities of modern supercomputers, specifically by limiting their working sets and restricting fault tolerance techniques, which become critical on systems consisting of tens of thousands of components. This paper resolves the dilemma between high performance and high reliability by presenting an alternative system design that uses node-local storage to improve aggregate system I/O bandwidth. In this work, we focus on the checkpointing use case and present an experimental evaluation of the Scalable Checkpoint/Restart (SCR) library, a new adaptive checkpointing library that uses node-local storage to significantly improve the checkpointing performance of large-scale supercomputers. Experiments show that SCR achieves unprecedented write speeds, reaching a measured 700 GB/s of aggregate bandwidth on 8,752 processors and an estimated 1 TB/s for a similarly structured machine of 12,500 processors. This corresponds to a speedup of over 70x compared to the bandwidth provided by the 10 GB/s parallel file system the cluster uses. Further, SCR can adapt to an environment in which there is wide variation in performance or capacity among the individual node-local storage elements.
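
The bandwidth figures quoted in the abstract can be related to one another with a few lines of arithmetic. The sketch below is illustrative only: it assumes aggregate bandwidth scales linearly with processor count, which is one plausible reading of "a similarly structured machine" but is not stated as the authors' estimation method. The function names are hypothetical and are not part of the SCR library.

```python
# Figures taken from the abstract.
MEASURED_BW_GBS = 700.0   # measured aggregate write bandwidth, GB/s
MEASURED_PROCS = 8752     # processors used in the measurement
TARGET_PROCS = 12500      # processors in the hypothetical larger machine
PFS_BW_GBS = 10.0         # parallel file system bandwidth, GB/s


def per_processor_bandwidth(aggregate_gbs: float, procs: int) -> float:
    """Average bandwidth contributed by each processor, in GB/s."""
    return aggregate_gbs / procs


def extrapolate_linear(aggregate_gbs: float, procs: int, target_procs: int) -> float:
    """Scale aggregate bandwidth to a larger machine.

    ASSUMPTION: bandwidth grows linearly with processor count.
    """
    return per_processor_bandwidth(aggregate_gbs, procs) * target_procs


def speedup(bandwidth_gbs: float, baseline_gbs: float) -> float:
    """Ratio of node-local aggregate bandwidth to the baseline file system."""
    return bandwidth_gbs / baseline_gbs


if __name__ == "__main__":
    per_proc = per_processor_bandwidth(MEASURED_BW_GBS, MEASURED_PROCS)
    estimate = extrapolate_linear(MEASURED_BW_GBS, MEASURED_PROCS, TARGET_PROCS)
    print(f"per-processor bandwidth: {per_proc * 1000:.1f} MB/s")
    print(f"linear estimate at {TARGET_PROCS} processors: {estimate:.1f} GB/s")
    print(f"speedup over parallel file system: {speedup(MEASURED_BW_GBS, PFS_BW_GBS):.0f}x")
```

Under the linear-scaling assumption, roughly 80 MB/s per processor extrapolates to just under 1,000 GB/s at 12,500 processors, in line with the 1 TB/s estimate, and the rounded 700 GB/s measurement is about 70 times the 10 GB/s file system.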

[Preview]

Attachments
Files	Size	Format	View
RO201705170001307LZ	5787KB	PDF	download


Scalable I/O Systems via Node-Local Storage: Approaching 1 TB/sec File I/O

Bronevetsky, G ; Moody, A
Keywords: CAPACITY; DESIGN; EVALUATION; MAINTENANCE; PERFORMANCE; RELIABILITY; STORAGE; SUPERCOMPUTERS; TOLERANCE
DOI : 10.2172/964079 RP-ID : LLNL-TR-415791 PID : OSTI ID: 964079 Others : TRN: US200919%%180
Subject classification: Social Sciences, Humanities and Arts (General)
United States | English
Source: SciTech Connect


	Document metrics
	Downloads: 10	Views: 15
