How to program a parallel machine has long been a major research problem. Many tools, languages and libraries have been developed to make parallel programming more accessible to most users. However, whatever approach is taken to program a parallel machine, there is always a trade-off between productivity, performance and portability. It is very hard to develop a system that requires only short, concise code yet achieves close-to-optimal performance on a wide range of parallel machines.

In this thesis, a novel programming framework is developed to achieve a good combination of productivity, performance and portability. The framework is designed around computation patterns that carry parallelism information, and it can efficiently map these patterns onto a parallel machine. It also uses C++ templates to generate optimized code for different compositions of computation patterns. The patterns are implemented in a novel way that allows automatic high-level optimization at compile time. Benchmarks show that the framework can express computation kernels in a few lines of code and achieve the performance of the corresponding optimized C code on multi-core CPUs.
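To make the idea of template-based composition of computation patterns more concrete, the following is a minimal sketch only, and not the framework's actual API, which the abstract does not show. All names here (`VecSrc`, `MapExpr`, `from_vector`, `map_pattern`, `reduce_pattern`, `sum_of_squares`) are hypothetical. The sketch assumes a lazy "map" pattern and a "reduce" pattern; because the composition is encoded in the template types, a compiler can inline the whole pipeline into a single loop without a temporary array. It runs sequentially and says nothing about how the real framework maps patterns onto parallel hardware.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical leaf source: a non-owning view over a std::vector.
template <typename T>
struct VecSrc {
    const std::vector<T>* data;
    std::size_t size() const { return data->size(); }
    const T& operator[](std::size_t i) const { return (*data)[i]; }
};

// Hypothetical "map" pattern: applies f lazily, element by element.
// No intermediate container is allocated.
template <typename Src, typename F>
struct MapExpr {
    Src src;
    F f;
    std::size_t size() const { return src.size(); }
    auto operator[](std::size_t i) const { return f(src[i]); }
};

template <typename T>
VecSrc<T> from_vector(const std::vector<T>& v) { return VecSrc<T>{&v}; }

template <typename Src, typename F>
MapExpr<Src, F> map_pattern(Src src, F f) { return MapExpr<Src, F>{src, f}; }

// Hypothetical "reduce" pattern: folds any source with op.
// Since Src is a template parameter, src[i] resolves statically,
// so the map and the reduction can be fused into one loop.
template <typename Src, typename T, typename Op>
T reduce_pattern(const Src& src, T init, Op op) {
    T acc = init;
    for (std::size_t i = 0; i < src.size(); ++i) {
        acc = op(acc, src[i]);
    }
    return acc;
}

// Example kernel: a map composed with a reduce in one expression.
double sum_of_squares(const std::vector<double>& v) {
    return reduce_pattern(
        map_pattern(from_vector(v), [](double x) { return x * x; }),
        0.0,
        [](double a, double b) { return a + b; });
}
```

Under these assumptions, a call such as `sum_of_squares(v)` behaves like a hand-written loop that accumulates `v[i] * v[i]`; the pattern objects exist only at the type level after inlining. The sketch requires C++14 for the deduced return type of `operator[]`.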
High-performance parallel programming framework using template-based static optimization