Thesis details
Algorithmic approaches to enhancing and exploiting application-level error tolerance
Sloan, Joseph
Keywords: Fault Tolerance; Application-level Error Tolerance; Algorithmic Based Fault Tolerance (ABFT); Application Robustification; Stochastic Processors; Reliability and Hardware Variability; Error localization; Partial Recomputation; Robust Sparse Linear Algebra; Algorithmic Selection for Error Resilience
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/46706/Joseph_Sloan.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

As late-CMOS process scaling leads to increasingly variable circuits and logic, and as most post-CMOS technologies in sight appear to have largely stochastic characteristics, hardware reliability has become a first-order design concern. To make matters worse, emerging computing systems are becoming increasingly power-constrained. Traditional hardware and software approaches are likely to be impractical for these power-constrained systems because they rely heavily on redundant, worst-case, and conservative designs. The primary goal of this research has been to investigate how inherent application and algorithm characteristics (e.g., natural error resilience, spatial and temporal reuse, and fault containment) can be leveraged to build more efficient robust systems. This dissertation describes algorithmic approaches that leverage application- and algorithm-awareness for building such systems. These approaches include (a) application-specific techniques for low-overhead fault detection, (b) an algorithmic approach for error correction using localization, (c) selection of scientific computing solver schemes to leverage application-level error resilience, and (d) a numerical optimization-based methodology for converting applications into a more error-tolerant form. This dissertation shows that application- and algorithm-awareness can significantly increase the robustness of computing systems while also reducing the cost of meeting reliability targets.
