Dissertation Details
Compilers for portable programming of heterogeneous parallel & approximate computing systems
Srivastava, Prakalp
Keywords: Heterogeneous; Approximate; Programmability; GPUs
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/104762/SRIVASTAVA-DISSERTATION-2019.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

Programming heterogeneous systems such as the System-on-chip (SoC) processors in modern mobile devices can be extremely complex because a single system may include multiple different parallelism models, instruction sets, and memory hierarchies, and different systems use different combinations of these features. This is further complicated by software and hardware approximate computing optimizations. Different compute units on an SoC use different approximate computing methods, and an application is usually composed of multiple compute kernels, each specialized to run on different hardware. Determining how best to map such an application to a modern heterogeneous system is an open research problem.
First, we propose a parallel abstraction of heterogeneous hardware that is a carefully chosen combination of well-known parallel models and is able to capture the parallelism in a wide range of popular parallel hardware. This abstraction uses a hierarchical dataflow graph with side effects and vector SIMD instructions. We use this abstraction to define a parallel program representation called HPVM that aims to address both functional portability and performance portability across heterogeneous systems.
Second, we extend the HPVM representation to enable accuracy-aware performance and energy tuning on heterogeneous systems with multiple compute units and approximation methods. We call this extension ApproxHPVM; it automatically translates end-to-end application-level accuracy constraints into accuracy requirements for individual operations. ApproxHPVM uses a hardware-agnostic accuracy-tuning phase to do this translation, which greatly speeds up the analysis, enables greater portability, and enables future capabilities like accuracy-aware dynamic scheduling and design space exploration.
We have implemented a prototype HPVM system, defining the HPVM IR as an extension of the LLVM compiler IR, compiler optimizations that operate directly on HPVM graphs, and code generators that translate the virtual ISA to NVIDIA GPUs, Intel’s AVX vector units, and to multicore X86-64 processors. Experimental results show that HPVM optimizations achieve significant performance improvements, HPVM translators achieve performance competitive with manually developed OpenCL code for both GPUs and vector hardware, and that runtime scheduling policies can make use of both program and runtime information to exploit the flexible compilation capabilities. Furthermore, our evaluation of ApproxHPVM shows that our framework can offload chunks of approximable computations to special purpose accelerators that provide significant gains in performance and energy, while staying within a user-specified application-level accuracy constraint with high probability.

【 Preview 】
Attachments
Files Size Format View
Compilers for portable programming of heterogeneous parallel & approximate computing systems 4232KB PDF download