As software grows in size and complexity, and as vendors face increasing time-to-market pressure, it has become increasingly difficult to deliver bulletproof software. Consequently, software systems still fail in production. Once a failure occurs in a production system, it is important for the vendor to troubleshoot it as quickly as possible, since such failures directly affect customers. Vendors therefore typically invest significant resources in production failure diagnosis. Unfortunately, diagnosing these failures is notoriously difficult. Constrained by both privacy and expense, software vendors often cannot reproduce them. Support engineers and developers therefore continue to rely on the logs printed by the run-time system to diagnose production failures. Unfortunately, the current failure diagnosis experience with log messages, colloquially referred to as “printf-debugging”, is far from pleasant. First, such diagnosis requires expert knowledge, and narrowing down root causes is time-consuming and tedious. Second, because log messages are written in an ad hoc manner, they are frequently insufficient for detailed failure diagnosis.

This dissertation makes three main contributions towards improving the diagnosis of production failures. The first contribution is a practical technique that automatically analyzes log messages and source code to help programmers debug a failure. Given a production software failure and its printed log messages, programmers must first map each log message to the source code statement that printed it and then manually work backwards to infer what conditions might have led to the failure. This manual detective work remains a tedious and error-prone process.
To address this problem, this dissertation designs and evaluates a technique, named SherLog, that analyzes source code by leveraging information provided by run-time logs to reconstruct what must or may have happened during the failed production run.

The second contribution is an understanding of the quality of log messages for failure diagnosis. The efficacy of failure diagnosis, whether manual or with automatic log inference tools such as SherLog, is predicated on the quality of log messages. However, there is little empirical data about how well existing logging practices work and how they can be improved. This dissertation provides the first comprehensive study of logging effectiveness. By examining developers’ own modifications to their logging code in the revision history, this study found that developers often do not get log messages right on their first attempt, and thus spend significant effort modifying them as afterthoughts. It further provides many findings on where programmers spend most of their effort when modifying log messages.

The third main contribution of this dissertation, informed and inspired by the characteristic study, is to improve the quality of log messages in order to enhance postmortem failure diagnosis. In particular, this dissertation invents two techniques for this purpose: LogEnhancer and Errlog. LogEnhancer is motivated by a simple observation from our characteristic study: log messages often do not contain sufficient information for diagnosis. LogEnhancer solves this problem by systematically and automatically “enhancing” each existing log printing statement in the source code so that it collects additional informative variable values. However, LogEnhancer does not insert new log messages. The second technique, Errlog, inserts new log messages into the program to significantly reduce diagnosis time with only negligible logging overhead.
This technique is driven by an empirical study of 250 real-world failures. A controlled user study suggests that Errlog and LogEnhancer can cut diagnosis time by 60.7%.
Improving failure diagnosis via better design and analysis of log messages