Thesis details
GUMSMP: a scalable parallel Haskell implementation
QA75 Electronic computers. Computer science
Aljabri, Malak Saleh ; Trinder, Phil
University: University of Glasgow
Department: School of Computing Science
Keywords: parallel; multi-core
Others  :  http://theses.gla.ac.uk/6822/1/2015MalakPhD.pdf
Source: University of Glasgow
【 Abstract 】

The most widely available high performance platforms today are hierarchical, with shared memory leaves, e.g. clusters of multi-cores, or NUMA with multiple regions. The Glasgow Haskell Compiler (GHC) provides a number of parallel Haskell implementations targeting different parallel architectures. In particular, GHC-SMP supports shared memory architectures, and GHC-GUM supports distributed memory machines. Both implementations use different, but related, runtime system (RTS) mechanisms and achieve good performance. A specialised RTS for the ubiquitous hierarchical architectures is lacking.

This thesis presents the design, implementation, and evaluation of a new parallel Haskell RTS, GUMSMP, that combines shared and distributed memory mechanisms to exploit hierarchical architectures more effectively. The design evaluates a variety of design choices and aims to efficiently combine scalable distributed memory parallelism, using a virtual shared heap over a hierarchical architecture, with low-overhead shared memory parallelism on shared memory nodes. Key design objectives in realising this system are to prefer local work, and to exploit mostly passive load distribution with pre-fetching.

Systematic performance evaluation shows that the automatic hierarchical load distribution policies must be carefully tuned to obtain good performance. We investigate the impact of several policies including work pre-fetching, favouring inter-node work distribution, and spark segregation with different export and select policies. We present the performance results for GUMSMP, demonstrating good scalability for a set of benchmarks on up to 300 cores. Moreover, our policies provide performance improvements of up to a factor of 1.5 compared to GHC-GUM.

The thesis provides a performance evaluation of distributed and shared heap implementations of parallel Haskell on a state-of-the-art physical shared memory NUMA machine. The evaluation exposes bottlenecks in memory management, which limit scalability beyond 25 cores. We demonstrate that GUMSMP, which combines both distributed and shared heap abstractions, consistently outperforms the shared memory GHC-SMP on seven benchmarks by a factor of 3.3 on average. Specifically, we show that the best results are obtained when sharing memory only within a single NUMA region, and using distributed memory system abstractions across the regions.

【 Preview 】
Attachments
Files Size Format View
GUMSMP: a scalable parallel Haskell implementation 5390KB PDF download