科技报告详细信息
Tri-Laboratory Linux Capacity Cluster 2007 SOW
Seager, M
Lawrence Livermore National Laboratory
关键词: 98 Nuclear Disarmament, Safeguards, And Physical Protection;    Capacity;    Computers;    Processing;    Simulation;   
DOI  :  10.2172/1036298
RP-ID  :  UCRL-SR-229520
RP-ID  :  W-7405-ENG-48
RP-ID  :  1036298
美国|英语
来源: UNT Digital Library
PDF
【 摘 要 】

The Advanced Simulation and Computing (ASC) Program (formerly know as Accelerated Strategic Computing Initiative, ASCI) has led the world in capability computing for the last ten years. Capability computing is defined as a world-class platform (in the Top10 of the Top500.org list) with scientific simulations running at scale on the platform. Example systems are ASCI Red, Blue-Pacific, Blue-Mountain, White, Q, RedStorm, and Purple. ASC applications have scaled to multiple thousands of CPUs and accomplished a long list of mission milestones on these ASC capability platforms. However, the computing demands of the ASC and Stockpile Stewardship programs also include a vast number of smaller scale runs for day-to-day simulations. Indeed, every 'hero' capability run requires many hundreds to thousands of much smaller runs in preparation and post processing activities. In addition, there are many aspects of the Stockpile Stewardship Program (SSP) that can be directly accomplished with these so-called 'capacity' calculations. The need for capacity is now so great within the program that it is increasingly difficult to allocate the computer resources required by the larger capability runs. To rectify the current 'capacity' computing resource shortfall, the ASC program has allocated a large portion of the overall ASC platforms budget to 'capacity' systems. In addition, within the next five to ten years the Life Extension Programs (LEPs) for major nuclear weapons systems must be accomplished. These LEPs and other SSP programmatic elements will further drive the need for capacity calculations and hence 'capacity' systems as well as future ASC capability calculations on 'capability' systems. To respond to this new workload analysis, the ASC program will be making a large sustained strategic investment in these capacity systems over the next ten years, starting with the United States Government Fiscal Year 2007 (GFY07). However, given the growing need for 'capability' systems as well, the budget demands are extreme and new, more cost effective ways of fielding these systems must be developed. This Tri-Laboratory Linux Capacity Cluster (TLCC) procurement represents the ASC first investment vehicle in these capacity systems. It also represents a new strategy for quickly building, fielding and integrating many Linux clusters of various sizes into classified and unclassified production service through a concept of Scalable Units (SU). The programmatic objective is to dramatically reduce the overall Total Cost of Ownership (TCO) of these 'capacity' systems relative to the best practices in Linux Cluster deployments today. This objective only makes sense in the context of these systems quickly becoming very robust and useful production clusters under the crushing load that will be inflicted on them by the ASC and SSP scientific simulation capacity workload.

【 预 览 】
附件列表
Files Size Format View
1036298.pdf 1391KB PDF download
  文献评价指标  
  下载次数:40次 浏览次数:8次