Dissertation Details
Automatic recovery for request oriented systems
Lenharth, Andrew D.
Keywords: Recovery; Transactions; Operating Systems; Compilers
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/18421/Lenharth_Andrew.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

Gracefully recovering from software and hardware faults is important to ensuring highly reliable and available systems. Operating systems have privileged access to all aspects of system operation, so a fault in the operating system can affect the entire system. Existing approaches to operating system recovery either do not protect the entire system or require a completely new operating system design.

This dissertation presents a new approach to fault recovery in operating systems called Recovery Domains. This approach allows recovery from unanticipated faults in commodity operating systems. Recovery is organized around the concept of a dynamic request. Operating system entry points initiate requests to perform some action. System calls, for example, are a request by an application to the operating system. When a fault is detected, the recovery system rolls back the effects of the offending recovery domain while leaving the remainder of the system running. To ensure that the entire system (including the state of other concurrent kernel threads) remains consistent after the rollback, dependencies between domains are tracked as the system runs. When rolling back a faulting domain, any other domains that were dependent on it, because of dataflow between the domains, are rolled back and restarted.

Recovery Domains do not make faults transparent. Request failures are reported to the requester. This visibility allows handling of faults which are permanent: those faults which would reoccur if the request were retried. Recovery Domains also handle timing and transient faults.

Recovery Domains require compiler support to instrument the system. The necessary support is simple, but can cause unnecessarily large system overhead. This dissertation describes several performance improvements to Recovery Domains based on dynamic analysis of the system state and static analysis of memory regions, allocators, and locks. Runtime analysis of the interdependence of the active requests can allow reduced tracking of state changes. The recovery compiler can reason about memory regions and data structures protected by a lock to eliminate instrumentation on many operations to locked memory. “Fresh” heap objects, those objects which have been allocated and have not yet become visible to other requests and threads, require no instrumentation. These improvements to the recovery runtime and compiler provide substantial performance improvements over simpler implementations.

This dissertation describes the goals, approach, semantics, and programming model of Recovery Domains; the minimal implementation of the runtime and compiler; the static analysis and optimization at the compiler level and dynamic optimization to the runtime; and the porting of two significantly different versions of the Linux kernel to the recovery system. It evaluates the overhead, effectiveness, and coverage of recovery. Finally, it describes the potential integration of a model fault detector with the Recovery Domains system.
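
The rollback behavior described in the abstract (undo the faulting domain, plus any domain that depends on it through dataflow) can be illustrated with a toy model. The sketch below is not the dissertation's implementation: the class `RecoveryRuntime`, its methods, and the use of a Python dictionary as "memory" are all hypothetical names chosen for illustration. It assumes a single global undo log, treats both reads and overwrites of another domain's uncommitted data as dependencies, and omits commit handling entirely.

```python
class RecoveryRuntime:
    """Toy model of domain rollback; every name here is hypothetical."""

    def __init__(self):
        self.memory = {}
        self.undo_log = []     # (domain, key, existed, old_value), in write order
        self.last_writer = {}  # key -> domain whose logged write is currently visible
        self.dependents = {}   # domain -> domains that consumed its data

    def begin(self, domain):
        self.dependents[domain] = set()

    def _depend(self, domain, key):
        # Touching data last written by another domain creates a dependency.
        writer = self.last_writer.get(key)
        if writer is not None and writer != domain:
            self.dependents[writer].add(domain)

    def read(self, domain, key):
        self._depend(domain, key)
        return self.memory.get(key)

    def write(self, domain, key, value):
        self._depend(domain, key)
        # Log the previous state so the write can be undone later.
        self.undo_log.append((domain, key, key in self.memory, self.memory.get(key)))
        self.memory[key] = value
        self.last_writer[key] = domain

    def fail(self, domain):
        """Roll back `domain` and everything transitively dependent on it."""
        victims, stack = set(), [domain]
        while stack:
            d = stack.pop()
            if d not in victims:
                victims.add(d)
                stack.extend(self.dependents[d])
        kept = []
        for entry in reversed(self.undo_log):  # undo newest writes first
            d, key, existed, old = entry
            if d not in victims:
                kept.append(entry)
            elif existed:
                self.memory[key] = old
            else:
                self.memory.pop(key, None)
        kept.reverse()
        self.undo_log = kept
        # Rebuild bookkeeping from the surviving log entries.
        self.last_writer = {key: d for d, key, _, _ in self.undo_log}
        for d in victims:
            del self.dependents[d]
        for deps in self.dependents.values():
            deps -= victims
        return victims


rt = RecoveryRuntime()
for name in ("A", "B", "C"):
    rt.begin(name)
rt.write("A", "x", 1)
rt.write("C", "z", 7)                       # independent of A
rt.write("B", "y", rt.read("B", "x") + 1)   # B consumes A's data
rolled_back = rt.fail("A")                  # rolls back A and B; C is untouched
```

In this run, `rolled_back` contains `A` and `B`, and only C's write to `z` remains in `rt.memory`. A real kernel-level system would additionally need to restart the rolled-back requests and report the failure to the requester, as the abstract describes; those steps are left out of this sketch.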

【 Preview 】
Attachments
Files Size Format View
Automatic recovery for request oriented systems 1058KB PDF download
Document metrics
Downloads: 16    Views: 35