Technical Report Details
Hierarchical resilience with lightweight threads.
Wheeler, Kyle Bruce
Keywords: ACCELERATORS; COMMUNICATIONS; DESIGN; FUNCTIONALS; KERNELS; PROGRAMMING
DOI  :  10.2172/1029809
RP-ID  :  SAND2011-7604
PID  :  OSTI ID: 1029809
Others  :  TRN: US1200049
Subject classification: Nuclear physics and high-energy physics
United States | English
Source: SciTech Connect
[Abstract]
This paper proposes a methodology for providing robustness and resilience in a highly threaded distributed- and shared-memory environment, based on well-defined inputs and outputs to lightweight tasks. These inputs and outputs form a failure 'barrier', allowing tasks to be restarted or duplicated as necessary. These barriers must be expanded based on task behavior, such as communication between tasks, but do not prohibit any given behavior. High-performance computing codes appear to be trending toward self-contained functions that mimic functional programming: software designers specify their core functions in side-effect-free or low-side-effect ways, with well-defined inputs and outputs. This makes it possible to copy the inputs to wherever they need to be (whether that is the other side of the PCI bus or the other side of the network), do work on those inputs using local memory, and then copy the outputs back as needed. This design pattern is popular among new distributed threading environment designs. Such designs include the Barcelona StarSs system, distributed OpenMP systems, the Habanero-C and Habanero-Java systems from Vivek Sarkar at Rice University, the HPX/ParalleX model from LSU, as well as our own Scalable Parallel Runtime (SPR) effort and the Trilinos stateless kernels. This design pattern is also shared by CUDA and several OpenMP extensions for GPU-type accelerators (e.g., the PGI OpenMP extensions).
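The copy-in, compute-locally, copy-out pattern described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the report: the names `run_resilient`, `TaskFailure`, and `make_flaky_sum_of_squares` are hypothetical, a deep copy stands in for moving inputs across a bus or network, and the fault is simulated. The sketch only shows why well-defined inputs make a task safe to restart.

```python
import copy
from typing import Any, Callable, Tuple


class TaskFailure(Exception):
    """Hypothetical signal that one attempt at a task was lost."""


def run_resilient(task: Callable[..., Any],
                  inputs: Tuple[Any, ...],
                  max_attempts: int = 3) -> Any:
    """Run a task on private copies of its inputs, restarting on failure."""
    for attempt in range(1, max_attempts + 1):
        # "Copy the inputs to wherever they need to be": here, a deep copy.
        # The originals are never touched, so they act as the failure barrier.
        local_inputs = copy.deepcopy(inputs)
        try:
            # The return value is the task's only output ("copy back").
            return task(*local_inputs)
        except TaskFailure:
            if attempt == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")


def make_flaky_sum_of_squares() -> Callable[[list], int]:
    """Build a task that fails partway through its first attempt (simulated)."""
    state = {"calls": 0}  # bookkeeping for the simulated fault only

    def task(values: list) -> int:
        state["calls"] += 1
        for i in range(len(values)):
            values[i] *= values[i]  # works in place on the local copy
            if state["calls"] == 1 and i == 1:
                raise TaskFailure("simulated fault")
        return sum(values)

    return task


if __name__ == "__main__":
    data = [1, 2, 3]
    result = run_resilient(make_flaky_sum_of_squares(), (data,))
    print(result, data)  # prints: 14 [1, 2, 3]
```

The first attempt dies after squaring two elements of its local copy. Because the retry starts from a fresh copy of the untouched inputs, it still returns the correct result. Had the task mutated the caller's list directly, the restart would have begun from partially updated data.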