Technical Report Details
Final Report for UC Berkeley Terascale Optimal PDE Solvers TOPS DOE Award Number DE-FC02-01ER25478 9/15/2001 – 9/14/2006
Demmel, James
University of California, Berkeley
Keywords: Tuning; 99 General And Miscellaneous//Mathematics, Computing, And Information Science; Computers; Implementation; Verification SciDAC, Terascale Optimal PDE Solvers, TOPS, Sparse Matrices, Automatic Performance Tuning, Sparse Solvers
DOI  :  10.2172/899881
RP-ID  :  DOE/ER/25478-5
RP-ID  :  FC02-01ER25478
RP-ID  :  899881
United States | English
Source: UNT Digital Library
【 Abstract 】

In many areas of science, physical experimentation may be too dangerous, too expensive or even impossible. Instead, large-scale simulations, validated by comparison with related experiments in well-understood laboratory contexts, are used by scientists to gain insight and confirmation of existing theories in such areas, without the benefit of full experimental verification. The goal of the TOPS ISIC was to develop and implement algorithms and support scientific investigations performed by DOE-sponsored researchers. A major component of this effort was to provide software for large-scale parallel computers capable of efficiently solving the enormous systems of equations arising from the nonlinear PDEs underlying these simulations. Several TOPS-supported packages were designed in part (ScaLAPACK) or in whole (SuperLU) at Berkeley, and are widely used beyond SciDAC and DOE. Beyond continuing to develop these codes, our main effort focused on automatic performance tuning of the sparse matrix kernels (e.g., sparse matrix-vector multiply, or SpMV) at the core of many TOPS iterative solvers. Based on the observation that the fastest implementation of SpMV (and other kernels) can depend dramatically on both the computer and the matrix (the latter of which is not known until run-time), we developed and released a system called OSKI (Optimized Sparse Kernel Interface) that automatically produces optimized versions of SpMV (and other kernels), hiding complicated implementation details from the user. OSKI led to a 2x speedup in SpMV in a DOE accelerator design code and a 2x speedup in a commercial lithography simulation, and has been downloaded over 500 times. In addition to a stand-alone version, OSKI was also integrated into the TOPS-supported PETSc system.
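
To make the idea of run-time kernel selection concrete, the following is a minimal sketch in Python. It is not OSKI's API or its actual tuning method: the function names (spmv_csr_loop, spmv_csr_sliced, pick_fastest), the compressed sparse row (CSR) storage, and the simple timing heuristic are all assumptions introduced for illustration. It shows two equivalent SpMV implementations and a selector that times each one on the matrix actually supplied at run time and keeps the faster.

```python
import time


def spmv_csr_loop(row_ptr, col_idx, values, x):
    """Compute y = A*x for A in CSR form using an explicit inner loop."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def spmv_csr_sliced(row_ptr, col_idx, values, x):
    """Same product, written with per-row slices and a generator sum."""
    y = []
    for i in range(len(row_ptr) - 1):
        lo, hi = row_ptr[i], row_ptr[i + 1]
        y.append(sum(v * x[j] for v, j in zip(values[lo:hi], col_idx[lo:hi])))
    return y


def pick_fastest(kernels, row_ptr, col_idx, values, x, repeats=5):
    """Hypothetical tuner: time each candidate on this matrix, return the fastest."""
    best_kernel, best_time = None, float("inf")
    for kernel in kernels:
        start = time.perf_counter()
        for _ in range(repeats):
            kernel(row_ptr, col_idx, values, x)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_kernel, best_time = kernel, elapsed
    return best_kernel


# CSR encoding of the 3x3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]].
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

kernel = pick_fastest([spmv_csr_loop, spmv_csr_sliced], row_ptr, col_idx, values, x)
y = kernel(row_ptr, col_idx, values, x)
print(kernel.__name__, y)
```

Which candidate wins depends on the machine and the matrix, which is the observation the abstract describes. A production tuner would compare far more variants (for example, different blocked storage formats) and would amortize the tuning cost over many multiplications.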
