Technical Report Details
Reliability Results of NERSC Systems
Petascale Data Storage Institute (PDSI) ; Mokhtarani, Akbar ; Kramer, William ; Hick, Jason
Keywords: 99;    AVAILABILITY;    INSTABILITY;    PERFORMANCE;    RELIABILITY;    STORAGE;    SWITCHES;    System reliability component failure computing systems
DOI  :  10.2172/934480
RP-ID  :  LBNL-430E
PID  :  OSTI ID: 934480
Others  :  TRN: US200814%%354
United States | English
Source: SciTech Connect
【 Abstract 】

To address the needs of future scientific applications for storing and accessing large amounts of data efficiently, one needs to understand the limitations of current technologies and how they may cause system instability or unavailability. A number of factors can affect system availability, ranging from facility-wide power outages to single points of failure such as network switches or global file systems. In addition, the failure of individual components can degrade the performance of a system. This paper analyzes both of these factors and their impact on the computational and storage systems at NERSC. The component failure data presented in this report focus primarily on disk drive failures in one of the computational systems and tape drive failures in HPSS. NERSC collected the available data on component failures and system-wide outages for its computational and storage systems over a six-year period and made them available to the HPC community through the Petascale Data Storage Institute.
