Thesis

【Abstract】

Scaling processor performance with each new technology node is essential to enable future applications on devices ranging from smartphones to servers. The traditional methods of achieving that performance, frequency scaling and single-core architectural enhancements, are no longer viable due to fundamental scaling limits. To continue scaling performance, parallel computers in the form of Chip Multi-processors (CMPs) are now prevalent, moving the challenge of parallel programming from a niche to the general domain.
One challenging area is scalable synchronization of access to shared data structures. It can take expert programmers many years using traditional methods to craft a scalable and correct scheme to synchronize access to data structures in a complex program. Researchers have been searching for methods to make synchronization more tractable. One proposal is to use "Transactional Programming" to abstract synchronization on shared data structures as transactions, in a fashion similar to database operations. Transactional programming can be efficiently supported by a "Transactional Memory" (TM) system.
One main problem with TM systems is scalability bottlenecks. When transactional applications are written to emulate the expected practices of future average programmers, performance on large CMPs can be worse than on a single processor. This should not happen on a system meant to make programming easier. It happens because transactions, as represented in the TM system, may depend on each other (they access the same data and therefore must serialize) without the programmer being aware of these dependencies, because the abstraction hides system details.
This thesis develops a hardware/software approach to alleviate scalability bottlenecks in TM systems while maintaining the level of abstraction presented in transactional programming.
I first introduce "Proactive Transaction Scheduling" (PTS), a technique that profiles parallel code at runtime to determine the order in which transactions should execute to maintain acceptable forward progress. I then propose using PTS to automatically identify transactions that cause large amounts of serialization. These transactions are then accelerated using an asymmetric CMP to improve performance. I also show that PTS can be used to partition resources in a multi-threaded processor core, yielding better overall performance than a fair partitioning of resources.

【Preview】

Attachments
Files	Size	Format	View
A Hardware/Software Approach for Alleviating Scalability Bottlenecks in Transactional Memory Applications.	3790KB	PDF	download


A Hardware/Software Approach for Alleviating Scalability Bottlenecks in Transactional Memory Applications.
Hardware Transactional Memory;Scheduling;Chip Multi-processors;Computer Science;Engineering;Computer Science & Engineering
Blake, Geoffrey Wyman; Wenisch, Thomas F.
University of Michigan
Others : https://deepblue.lib.umich.edu/bitstream/handle/2027.42/86452/blakeg_1.pdf?sequence=1&isAllowed=y
Switzerland | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship


