Conference Paper Details
6th Symposium on Operating Systems Design & Implementation
Understanding and Dealing with Operator Mistakes in Internet Services
Kiran Nagaraja ; Fábio Oliveira ; Ricardo Bianchini ; Richard P. Martin ; Thu D. Nguyen
PID  :  75317
Source: CEUR
【 Abstract 】

Operator mistakes are a significant source of unavailability in modern Internet services. In this paper, we first characterize these mistakes by performing an extensive set of experiments using human operators and a realistic three-tier auction service. The mistakes we observed range from software misconfiguration, to fault misdiagnosis, to incorrect software restarts. We next propose to validate operator actions before they are made visible to the rest of the system. We demonstrate how to accomplish this task via the creation of a validation environment that is an extension of the online system, where components can be validated using real workloads before they are migrated into the running service. We show that our prototype validation system can detect 66% of the
