Journal Article

【Abstract】

GPU spatial multitasking has proven effective at executing different applications concurrently through streaming multiprocessor (SM) partitioning. However, although it maximizes total throughput, latency-critical applications often miss their deadlines because of the increased execution time. Furthermore, SM partitioning cannot allocate an appropriate L1 cache size to each kernel. To solve these problems, this paper proposes GPU Fine-Tuner, a new application-aware resource allocation framework that assigns appropriate resources to GPU kernels. To minimize the execution time of latency-constrained applications, it assigns them more SMs when performance is not affected. It also increases the cache size of SMs running cache-sensitive kernels by borrowing resources from neighboring SMs running cache-insensitive kernels. Experimental results show that Fine-Tuner outperforms GPU spatial multitasking, achieving up to 15% lower average latency without performance degradation.
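
The two ideas in the abstract (giving latency-critical kernels extra SMs, and letting cache-sensitive kernels borrow L1 capacity) can be illustrated with a toy allocation policy. The sketch below is not the paper's algorithm: the `Kernel` class, the `allocate` function, and the `shift` and `lend_fraction` parameters are all hypothetical names and values chosen for illustration, and the real framework presumably decides how much to move from measured performance rather than from fixed constants.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    """Hypothetical kernel descriptor; the fields are illustrative assumptions."""
    name: str
    latency_critical: bool
    cache_sensitive: bool


def allocate(kernels, total_sms, l1_kb_per_sm, shift=2, lend_fraction=0.5):
    """Toy policy: return (SMs per kernel, L1 KB per SM for each kernel)."""
    n = len(kernels)

    # Baseline spatial multitasking: split the SMs evenly between kernels.
    sms = {k.name: total_sms // n for k in kernels}
    for k in kernels[: total_sms % n]:
        sms[k.name] += 1

    # Latency step: each non-critical kernel hands up to `shift` SMs to the
    # latency-critical kernels, but always keeps at least one SM for itself.
    critical = [k for k in kernels if k.latency_critical]
    others = [k for k in kernels if not k.latency_critical]
    if critical:
        turn = 0
        for donor in others:
            give = max(0, min(shift, sms[donor.name] - 1))
            sms[donor.name] -= give
            for _ in range(give):
                sms[critical[turn % len(critical)].name] += 1
                turn += 1

    # Cache step: SMs running cache-insensitive kernels lend a fraction of
    # their L1, and the pooled capacity is spread over the SMs that run
    # cache-sensitive kernels. Total L1 capacity is conserved.
    l1 = {k.name: float(l1_kb_per_sm) for k in kernels}
    sensitive = [k for k in kernels if k.cache_sensitive]
    insensitive = [k for k in kernels if not k.cache_sensitive]
    sensitive_sms = sum(sms[k.name] for k in sensitive)
    if sensitive_sms > 0 and insensitive:
        lent = l1_kb_per_sm * lend_fraction
        pool = sum(sms[k.name] * lent for k in insensitive)
        for k in insensitive:
            l1[k.name] -= lent
        for k in sensitive:
            l1[k.name] += pool / sensitive_sms
    return sms, l1


if __name__ == "__main__":
    demo = [Kernel("A", True, True), Kernel("B", False, False)]
    print(allocate(demo, total_sms=16, l1_kb_per_sm=16))
```

With 16 SMs and two kernels, the latency-critical, cache-sensitive kernel A ends up with 10 SMs and a larger per-SM L1, while kernel B keeps 6 SMs with half of its original per-SM L1.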

【License】

CC BY


IEICE Electronics Express
Efficient GPU multitasking with latency minimization and cache boosting

Yongjun Park¹ Jiho Kim¹ Minsung Chu¹
[1] School of Electronic and Electrical Engineering, Hongik University
Keywords: GPGPU; multitasking; energy; resource sharing; workload balancing
DOI : 10.1587/elex.14.20161158
Subject classification: Electronic, Optical and Magnetic Materials
Source: Denshi Jouhou Tsuushin Gakkai