Thesis

【Abstract】

In multicores, performance-critical synchronization is increasingly performed in a lock-free manner using atomic instructions such as CAS or LL/SC. However, when many processors synchronize on the same variable, performance can still degrade significantly. Contending writes get serialized, creating a non-scalable condition. Past proposals that build hardware queues of synchronizing processors do not fundamentally solve this problem. At best, they help to efficiently serialize the contending writes. We propose a novel architecture that breaks the serialization of hardware queues and enables the queued processors to perform lock-free synchronization in parallel. The architecture, called Caspar, is able to (1) execute the CASes in the queued-up processors in parallel through eager forwarding of expected values, and (2) validate the CASes in parallel and dequeue groups of processors at a time. The result is highly scalable synchronization. We evaluate Caspar with simulations of a 64-core chip. Compared to existing proposals with hardware queues, Caspar improves the throughput of kernels by 32% on average and reduces the execution time of the sections considered in lock-free versions of applications by 47% on average. This makes these sections 2.5x faster than in the original applications.
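
The contended pattern the abstract describes can be illustrated in software. The following is a minimal sketch, not code from the thesis and not a model of the Caspar hardware: several threads increment one shared counter with a CAS retry loop, and each failed CAS is counted as a rough indicator of how much the contending writes are being serialized. The names `cas_increment`, `run_contended`, and `RunResult` are hypothetical and chosen only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Outcome of one contended run (hypothetical helper type for this sketch).
struct RunResult {
    std::uint64_t final_value;  // value of the shared counter after all threads finish
    std::uint64_t failed_cas;   // CAS attempts that failed and had to be retried
};

// Lock-free increment: read the current value, then try to install value + 1.
// Returns how many CAS attempts failed before one succeeded.
std::uint64_t cas_increment(std::atomic<std::uint64_t>& counter) {
    std::uint64_t failures = 0;
    std::uint64_t expected = counter.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak writes the current value into `expected`,
    // so the next attempt uses a fresh expected value. A weak CAS may also fail
    // spuriously; such failures are counted here as well.
    while (!counter.compare_exchange_weak(expected, expected + 1,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
        ++failures;
    }
    return failures;
}

// Runs `num_threads` threads that all synchronize on the same variable.
RunResult run_contended(unsigned num_threads, std::uint64_t increments_per_thread) {
    assert(num_threads > 0);
    std::atomic<std::uint64_t> counter{0};
    std::atomic<std::uint64_t> failed{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            std::uint64_t local_failures = 0;
            for (std::uint64_t i = 0; i < increments_per_thread; ++i) {
                local_failures += cas_increment(counter);
            }
            failed.fetch_add(local_failures, std::memory_order_relaxed);
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return {counter.load(), failed.load()};
}
```

Every increment is applied exactly once, so the final value equals threads times increments. The number of failed CASes varies from run to run and typically grows with the thread count, which is the software-visible side of the serialization the abstract refers to.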

【Preview】

Attachments
Files	Size	Format	View
Breaking serialization in lock-free multicore synchronization	630KB	PDF	download


Breaking serialization in lock-free multicore synchronization
Gangwani, Tanmay ; Torrellas, Josep
Keywords: lock-free synchronization; serialization; parallel programming
Others : https://www.ideals.illinois.edu/bitstream/handle/2142/92858/GANGWANI-THESIS-2016.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship


	Document metrics
	Downloads: 32	Views: 15
