Thesis Details
Adaptive Monitoring of Complex Software Systems using Management Metrics
System management;Monitoring;Metric correlations;Diagnosis;Electrical and Computer Engineering (Software Engineering)
Munawar, Mohammad Ahmad
University of Waterloo
Keywords: System management; Monitoring; Metric correlations; Diagnosis; Electrical and Computer Engineering (Software Engineering)
Others: https://uwspace.uwaterloo.ca/bitstream/10012/4797/1/Munawar_Mohammad.pdf
Switzerland|English
Source: UWSPACE Waterloo Institutional Repository
【 Abstract 】

Software systems supporting networked, transaction-oriented services are large and complex; they comprise a multitude of inter-dependent layers and components, and they implement many dynamic optimization mechanisms. In addition, these systems are subject to workload that is hard to predict. These factors make monitoring these systems as well as performing problem determination challenging and costly. In this thesis we tackle these challenges with the goal of lowering the cost and improving the effectiveness of monitoring and problem determination by reducing the dependence on human operators. Specifically, this thesis presents and demonstrates the effectiveness of an efficient, automated monitoring approach which enables detection of errors and failures, and which assists in localizing faults. Software systems expose various types of monitoring data; this thesis focuses on the use of management metrics to monitor a system's health. We devise a system modeling approach which entails modeling stable, statistical correlations among management metrics; these correlations characterize a system's normal behaviour. This approach allows a system model to be built automatically and efficiently using the monitoring data alone. In order to control the monitoring overhead, and yet allow a system's health to be assessed reliably, we design an adaptive monitoring approach. This adaptive capability builds on the flexible nature of our system modeling approach, which allows the set of monitored metrics to be altered at runtime. We develop methods to automatically select management metrics to collect at the minimal monitoring level, without any domain knowledge. In addition, we devise an automated fault localization approach, which leverages the ability of the monitoring system to analyze individual metrics.
Using a realistic, multi-tier software system, including different applications based on Java Enterprise Edition and industrial-strength products, we evaluate our system modeling approach. We show that stable metric correlations exist in complex software systems and that many of these correlations can be modeled using simple, efficient techniques. We investigate the effect of the collection of management metrics on system performance. We show that the monitoring overhead can be high and thus needs to be controlled. We employ fault injection experiments to evaluate the effectiveness of our adaptive monitoring and fault localization approach. We demonstrate that our approach is cost-effective, has high fault coverage and, in the majority of the cases studied, provides pertinent diagnosis information. The main contribution of this work is to show how to monitor complex software systems and determine problems in them automatically and efficiently. Our solution approach has wide applicability and the techniques we use are simple and yet effective. Our work suggests that the cost of monitoring software systems is not necessarily a function of their complexity, providing hope that the health of increasingly large and complex systems can be tracked with a limited amount of human resources and without sacrificing much system performance.
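The idea of modeling stable correlations among management metrics and using violated correlations to point at suspect metrics can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the choice of pairwise least-squares regression, the `r2_min` and `slack` thresholds, the function names, and the vote-counting ranking are all assumptions made here for illustration.

```python
from itertools import combinations
from statistics import mean


def fit_linear(xs, ys):
    """Least-squares fit ys ~ slope * xs + intercept; returns (slope, intercept, r2)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant metric carries no correlation information
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


def build_model(history, r2_min=0.9, slack=3.0):
    """Keep metric pairs whose linear relation is strong in fault-free history.

    history maps a metric name to its list of samples (equal lengths).
    r2_min and slack are illustrative thresholds, not values from the thesis.
    """
    model = {}
    for a, b in combinations(sorted(history), 2):
        fit = fit_linear(history[a], history[b])
        if fit is None or fit[2] < r2_min:
            continue
        slope, intercept, _ = fit
        residuals = [abs(y - (slope * x + intercept))
                     for x, y in zip(history[a], history[b])]
        # Tolerate a multiple of the largest residual seen during training.
        model[(a, b)] = (slope, intercept, slack * max(residuals) + 1e-9)
    return model


def violated_pairs(model, sample):
    """Return the modeled pairs whose relation does not hold in one new sample."""
    broken = []
    for (a, b), (slope, intercept, tolerance) in model.items():
        if a in sample and b in sample:
            if abs(sample[b] - (slope * sample[a] + intercept)) > tolerance:
                broken.append((a, b))
    return broken


def rank_suspects(violations):
    """Rank metrics by how many violated correlations they take part in."""
    counts = {}
    for a, b in violations:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return sorted(counts, key=lambda m: (-counts[m], m))
```

Given fault-free samples for a few hypothetical metrics (for instance `requests`, `db_queries`, and `cpu`), `build_model` keeps only the strongly correlated pairs, `violated_pairs` flags the pairs that break on a new observation, and `rank_suspects` orders metrics by how many broken pairs involve them. Because the model is a set of independent pairwise relations, a sample that carries only a subset of the metrics is still checked against the pairs it covers, which is the property an adaptive scheme that changes the monitored set at runtime would rely on.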

【 Preview 】
Attachments
Files Size Format View
Adaptive Monitoring of Complex Software Systems using Management Metrics 1576KB PDF download
Document metrics
Downloads: 26  Views: 44