Trends in computer engineering place renewed emphasis on increasing parallelism and heterogeneity. The rise of parallelism adds a further dimension to the challenge of portability, as different processors support different notions of parallelism, whether vector parallelism executing in a few threads on multicore CPUs or large-scale thread hierarchies on GPUs. Software thus faces obstacles to portability and efficient execution that go beyond differences in instruction sets: the underlying execution models of radically different architectures may themselves be incompatible. Dynamic compilation applied to data-parallel heterogeneous architectures presents an abstraction layer decoupling program representations from optimized binaries, thus enabling portability without encumbering performance.

This dissertation proposes several techniques that extend dynamic compilation to data-parallel execution models. These contributions include:
- characterization of data-parallel workloads
- machine-independent application metrics
- framework for performance modeling and prediction
- execution model translation for vector processors
- region-based compilation and scheduling

We evaluate these claims via the development of a novel dynamic compilation framework, GPU Ocelot, with which we execute real-world workloads from GPU computing. This framework enables GPU computing workloads to run efficiently on multicore CPUs, GPUs, and a functional simulator. We show that data-parallel workloads exhibit performance scaling, take advantage of vector instruction set extensions, and effectively exploit data locality via scheduling that attempts to maximize control locality.
A model of dynamic compilation for heterogeneous compute platforms