Thesis details
Parallel merge for many-core architectures
Lv, Jie ; Hwu, Wen-Mei W.
Keywords: Graphics processing unit (GPU); Parallel Merge
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/90824/LV-THESIS-2016.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

This thesis proposes a novel GPU implementation for merging two sorted arrays. We consider the problem of merging two arrays A and B into a single array C. Each element in the arrays has a key, and an ordering relation is defined on the keys. Arrays A and B have m and n elements, respectively, where m and n need not be equal. Both A and B are sorted according to the ordering relation. The task is to produce the output array C of size m + n, which consists of all the input elements from A and B and is sorted by the same ordering relation. We applied several GPU-specific optimizations to a parallel merge algorithm: coordinating the memory access pattern, making full use of shared memory, and reducing thread divergence. Our implementation achieves up to 10x and 40x speedup on the Titan-Z and GTX 980 GPUs, respectively, compared to the Thrust merge implementation.
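
The abstract does not say which parallel merge algorithm the thesis builds on, so the following is only an illustrative sketch of one common approach, co-rank (merge path) partitioning, written as sequential C++ rather than GPU code. The names `co_rank` and `partitioned_merge`, the use of `int` keys with `<=` as the ordering relation, and the `parts` parameter are all assumptions made for this example. Each output chunk depends only on its own slices of A and B, which is what would let separate threads or thread blocks process the chunks independently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for output position k, return how many of the first k
// merged elements come from a (the rest, k - i, come from b).
// Ties are resolved in favor of a, matching the behavior of std::merge.
std::size_t co_rank(std::size_t k, const std::vector<int>& a,
                    const std::vector<int>& b) {
    std::size_t lo = k > b.size() ? k - b.size() : 0;
    std::size_t hi = std::min(k, a.size());
    while (lo < hi) {
        std::size_t i = lo + (hi - lo) / 2;
        std::size_t j = k - i;  // j >= 1 here because i < hi <= k
        if (a[i] <= b[j - 1]) {
            lo = i + 1;  // a[i] still belongs among the first k outputs
        } else {
            hi = i;      // taking i elements from a is already enough
        }
    }
    return lo;
}

// Hypothetical driver: split the output into `parts` contiguous chunks and
// merge each chunk independently. On a GPU each loop iteration could map to
// one thread or block; here the iterations simply run one after another.
std::vector<int> partitioned_merge(const std::vector<int>& a,
                                   const std::vector<int>& b,
                                   std::size_t parts) {
    assert(parts > 0);
    const std::size_t total = a.size() + b.size();
    std::vector<int> c(total);
    for (std::size_t p = 0; p < parts; ++p) {
        const std::size_t k0 = total * p / parts;        // chunk start in c
        const std::size_t k1 = total * (p + 1) / parts;  // chunk end in c
        const auto i0 = static_cast<std::ptrdiff_t>(co_rank(k0, a, b));
        const auto i1 = static_cast<std::ptrdiff_t>(co_rank(k1, a, b));
        const auto j0 = static_cast<std::ptrdiff_t>(k0) - i0;
        const auto j1 = static_cast<std::ptrdiff_t>(k1) - i1;
        std::merge(a.begin() + i0, a.begin() + i1,
                   b.begin() + j0, b.begin() + j1,
                   c.begin() + static_cast<std::ptrdiff_t>(k0));
    }
    return c;
}
```

Calling `partitioned_merge(a, b, parts)` on two sorted vectors should give the same result as a single `std::merge` over the whole inputs, for any `parts` of at least 1. The memory-access, shared-memory, and divergence optimizations named in the abstract are not modeled in this sketch.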

【 Preview 】
Attachments
Files Size Format View
Parallel merge for many-core architectures 1369KB PDF download
  Document metrics  
  Downloads: 4  Views: 20