Conference paper details
11th International Conference on "Mesh methods for boundary-value problems and applications"
Scalability of parallel finite element algorithms on multi-core platforms
Kopysov, S.P.^1 ; Novikov, A.K.^1 ; Nedozhogin, N.S.^1 ; Rychkov, V.N.^1
Institute of Mechanics, Ural Branch of the Russian Academy of Sciences, 34 T. Baramzinoy, Izhevsk 426067, Russia^1
Keywords: Computational scalability; Finite element algorithms; Multi-core platforms; Multi-core processor; Multicore architectures; Numerical experiments; Processor performance; Unstructured meshes
Others  :  https://iopscience.iop.org/article/10.1088/1757-899X/158/1/012055/pdf
DOI  :  10.1088/1757-899X/158/1/012055
Source: IOP
【 Abstract 】

The speedup of element-by-element FEM algorithms depends not only on peak processor performance but also on the access time to shared mesh data. Removing the memory-bound bottleneck would significantly speed up unstructured mesh computations on hybrid multi-core architectures, where the gap between processor and memory performance continues to grow. Speedup can be achieved by ordering the unknowns so that only elements with no common nodes are processed in parallel, which minimizes memory conflicts. FEM assembly is performed with respect to this ordering, which defines how vectors are composed. The mesh can be partitioned into disjoint subdomains using different layer-by-layer schemes. In this work, we evaluated several partitioning schemes (block, odd, even, and their modifications) on multi-core platforms, using Gunther's Universal Law of Computational Scalability. We performed numerical experiments with element-by-element matrix-vector multiplication on unstructured meshes on multi-core processors accelerated by MIC and GPU. With ordering, we achieved a 5-fold speedup on CPU, a 40-fold speedup on MIC, and a 200-fold speedup on GPU.
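
The idea of processing in parallel only elements that share no nodes can be illustrated with a minimal Python sketch. It is not the authors' method: the paper uses layer-by-layer partitioning schemes (block, odd, even), whereas this sketch substitutes a simple greedy element coloring to get conflict-free groups. The names `color_elements`, `ebe_matvec`, and `usl_capacity` are hypothetical, as are the parameter names `alpha` (contention) and `beta` (coherency) used for Gunther's law in its commonly cited form C(N) = N / (1 + alpha (N - 1) + beta N (N - 1)).

```python
import numpy as np


def color_elements(elements):
    """Greedy coloring (an assumption, not the paper's layer-by-layer scheme):
    two elements that share a node never receive the same color."""
    node_colors = {}  # node -> set of colors used by elements touching it
    colors = []
    for elem in elements:
        used = set()
        for n in elem:
            used |= node_colors.get(n, set())
        c = 0
        while c in used:  # smallest color not used by a neighbouring element
            c += 1
        colors.append(c)
        for n in elem:
            node_colors.setdefault(n, set()).add(c)
    return colors


def ebe_matvec(elements, elem_matrices, x, colors):
    """Element-by-element y = A x without assembling the global matrix A."""
    y = np.zeros(len(x), dtype=float)
    for c in range(max(colors) + 1):
        # Elements of one color share no nodes, so the iterations of this
        # inner loop write to disjoint entries of y and could run in parallel.
        for e, elem in enumerate(elements):
            if colors[e] != c:
                continue
            idx = np.asarray(elem)
            y[idx] += elem_matrices[e] @ x[idx]
    return y


def usl_capacity(n, alpha, beta):
    """Universal Scalability Law: relative capacity on n processors."""
    return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


if __name__ == "__main__":
    # Three triangles; the first two share nodes 1 and 2 (hypothetical mesh).
    elements = [(0, 1, 2), (1, 2, 3), (3, 4, 5)]
    colors = color_elements(elements)
    mats = [np.eye(3) for _ in elements]  # placeholder element matrices
    y = ebe_matvec(elements, mats, np.ones(6), colors)
    print(colors, y, usl_capacity(8, 0.05, 0.001))
```

In this sketch the serial loop over colors plays the role of a synchronization point, while the loop over same-colored elements is the part that a threaded or GPU implementation would distribute.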
