Final Report for UC Berkeley Terascale Optimal PDE Solvers TOPS DOE Award Number DE-FC02-01ER25478 9/15/2001 – 9/14/2006 | |
Demmel, James | |
University of California, Berkeley | |
关键词: Tuning; 99 General And Miscellaneous//Mathematics, Computing, And Information Science; Computers; Implementation; Verification Scidac, Terascale Optimal Pde Solvers, Tops, Sparse Matrices, Automatic Performance Tuning, Sparse Solvers; | |
DOI : 10.2172/899881 RP-ID : DOE/ER/25478-5 RP-ID : FC02-01ER25478 RP-ID : 899881 |
|
美国|英语 | |
来源: UNT Digital Library | |
【 摘 要 】
In many areas of science, physical experimentation may be too dangerous, too expensive or even impossible. Instead, large-scale simulations, validated by comparison with related experiments in well-understood laboratory contexts, are used by scientists to gain insight and confirmation of existing theories in such areas, without benefit of full experimental verification. The goal of the TOPS ISIC was to develop and implement algorithms and support scientific investigations performed by DOE-sponsored researchers. A major component of this effort is to provide software for large scale parallel computers capable of efficiently solving the enormous systems of equations arising from the nonlinear PDEs underlying these simulations. Several TOPS supported packages where designed in part (ScaLAPACK) or in whole (SuperLU) at Berkeley, and are widely used beyond SciDAC and DOE. Beyond continuing to develop these codes, our main effort focused on automatic performance tuning of the sparse matrix kernels (eg sparse-matrix-vector-multiply, or SpMV) at the core of many TOPS iterative solvers. Based on the observation that the fastest implementation of SpMV (and other kernels) can depend dramatically both on the computer and the matrix (the latter of which is not known until run-time), we developed and released a system called OSKI (Optimized Sparse Kernel Interface) that will automatically produce optimized version of SpMV (and other kernels), hiding complicated implementation details from the user. OSKI led to a 2x speedup in SpMV in a DOE accelerator design code, a 2x speedup in a commercial lithography simulation, and has been downloaded over 500 times. In addition to a stand-alone version, OSKI was also integrated into the TOPS-supported PETSc system.
【 预 览 】
Files | Size | Format | View |
---|---|---|---|
899881.pdf | 49KB | download |