Conference paper details
24th IUPAP Conference on Computational Physics
Exploiting parallelism in many-core architectures: Lattice Boltzmann models as a test case
Physics; Computer Science
Mantovani, F.^1 ; Pivanti, M.^2 ; Schifano, S.F.^3 ; Tripiccione, R.^4
Department of Physics, Universität Regensburg, Germany^1
Department of Physics, Università di Roma la Sapienza, Italy^2
Department of Mathematics and Informatics, Università di Ferrara and INFN, Italy^3
Department of Physics and CMCS, Università di Ferrara and INFN, Italy^4
Keywords: Computational kernels; Large scale simulations; Lattice Boltzmann algorithms; Lattice Boltzmann models; Many-core architecture; Many-core processors; Micro architectures; Rayleigh-Taylor instabilities
Others  :  https://iopscience.iop.org/article/10.1088/1742-6596/454/1/012015/pdf
DOI  :  10.1088/1742-6596/454/1/012015
Subject classification: Computer Science (general)
Source: IOP
【 Abstract 】

In this paper we address the problem of identifying and exploiting techniques that optimize the performance of large-scale scientific codes on many-core processors. As a test bed we consider a state-of-the-art Lattice Boltzmann (LB) model that accurately reproduces the thermo-hydrodynamics of a 2D fluid obeying the equation of state of a perfect gas. The regular structure of Lattice Boltzmann algorithms makes it relatively easy to identify a large degree of available parallelism; the challenge is to map this parallelism onto processors whose architecture is becoming more and more complex, both in the increasing number of independent cores and, within each core, in vector instructions that operate on longer and longer data words. We take as an example the Intel Sandy Bridge micro-architecture, which supports AVX instructions operating on 256-bit vectors. We address the problem of efficiently implementing the key computational kernels of LB codes (streaming and collision) on this family of processors, introduce several successive optimization steps, and quantitatively assess the impact of each of them on performance. Our final result is a production-ready code already in use for large-scale simulations of the Rayleigh-Taylor instability. We analyze both raw performance and scaling figures, and compare with GPU-based implementations of similar codes.
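To make the two kernels named in the abstract concrete, here is a minimal sketch of one streaming step and one collision step. It rests on clearly stated assumptions and is not the authors' code: it uses the small, standard D2Q9 stencil with a BGK (single relaxation time) collision and periodic boundaries, whereas the paper's model is a thermal LB scheme. The function names (`init`, `stream`, `collide`, `total_mass`) and the relaxation parameter `tau` are hypothetical. Populations are stored as one 2D array per velocity, a layout chosen for this illustration because neighbouring sites of the same population are the kind of data a vector unit can process together.

```python
# D2Q9 velocities and weights (standard values). ASSUMPTION: D2Q9 + BGK is
# swapped in for illustration; it is not the model used in the paper.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def equilibrium(rho, ux, uy):
    """Second-order equilibrium populations for density rho and velocity (ux, uy)."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def init(nx, ny, rho=1.0, ux=0.0, uy=0.0):
    """Build f[p][y][x]: one 2D array per population, set to equilibrium."""
    feq = equilibrium(rho, ux, uy)
    return [[[feq[p] for _ in range(nx)] for _ in range(ny)] for p in range(9)]


def stream(f):
    """Streaming: move each population one site along its velocity (periodic)."""
    ny, nx = len(f[0]), len(f[0][0])
    out = [[[0.0] * nx for _ in range(ny)] for _ in range(9)]
    for p, (cx, cy) in enumerate(C):
        for y in range(ny):
            for x in range(nx):
                out[p][(y + cy) % ny][(x + cx) % nx] = f[p][y][x]
    return out


def collide(f, tau):
    """Collision: relax every site towards its local equilibrium (BGK), in place."""
    ny, nx = len(f[0]), len(f[0][0])
    for y in range(ny):
        for x in range(nx):
            pops = [f[p][y][x] for p in range(9)]
            rho = sum(pops)
            ux = sum(c[0] * v for c, v in zip(C, pops)) / rho
            uy = sum(c[1] * v for c, v in zip(C, pops)) / rho
            feq = equilibrium(rho, ux, uy)
            for p in range(9):
                f[p][y][x] = pops[p] - (pops[p] - feq[p]) / tau
    return f


def total_mass(f):
    """Sum of all populations over the whole lattice."""
    return sum(v for pop in f for row in pop for v in row)
```

A time step in this sketch is `f = collide(stream(f), tau)`. Every site is updated independently of the others within each kernel, which is the parallelism the abstract refers to: sites can be distributed across cores, and several sites can be packed into one vector register.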
