Thesis Details
Kernel Methods for Classification with Irregularly Sampled and Contaminated Data.
Kim, Joo Seuk; Zhu, Ji
University of Michigan
Keywords: Classification; Kernel Methods; Contaminated Data; Machine Learning; Computer Science; Electrical Engineering; Engineering; Electrical Engineering: Systems
Others  :  https://deepblue.lib.umich.edu/bitstream/handle/2027.42/89858/stannum_1.pdf?sequence=1&isAllowed=y
Switzerland|English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

Design of a classifier consists of two stages: feature extraction and classifier learning. For better performance, the nature, characteristics, or underlying structure of the data should be taken into account in either stage when designing a classifier. In this thesis, we present kernel methods for classification with irregularly sampled and contaminated data.

First, we propose a feature extraction method for irregularly sampled data. Irregularly sampled data often arises in medical applications where the vital signs of patients are monitored based on the severity of their condition and the availability of nursing staff. In particular, we consider an ICU (intensive care unit) admission prediction problem for a post-operative patient with possible sepsis. The experimental results show that the proposed features, when paired with kernel methods, have more discriminating power than those used by clinicians.

Second, we consider the one-class classification problem with contaminated data, where the majority of the data comes from a "nominal" distribution with a small fraction of the data coming from an outlying distribution. We deal with this problem by robustly estimating the nominal density (or a level set thereof) from the contaminated data. Our proposed density estimator achieves robustness by combining a traditional kernel density estimator (KDE) with ideas from classical M-estimation. The robustness of the density estimator is demonstrated with a representer theorem, the influence function, and experimental results.

Third, we propose a kernel classifier that optimizes the L_2 distances between "difference of densities". Like a support vector machine (SVM), the classifier is sparse and results from solving a quadratic program. We also provide statistical performance guarantees for the proposed L_2 kernel classifier in the form of a finite sample oracle inequality, and strong consistency in the sense of both ISE and probability of error.
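The second contribution above, combining a KDE with M-estimation, can be illustrated with a minimal sketch. The abstract does not spell out the algorithm, so everything below is an assumption made for illustration: a Gaussian kernel, the Huber loss, a threshold set to the median residual on the first pass, and an iteratively reweighted scheme in which each sample's weight shrinks as its distance (in the kernel feature space) from the current density estimate grows. The function names `gaussian_kernel_matrix`, `robust_kde_weights`, and `weighted_kde` are hypothetical.

```python
import numpy as np

def gaussian_kernel_matrix(X, Y, bandwidth):
    """Gaussian density kernel k(x, y) for all pairs of rows of X and Y."""
    d = X.shape[1]
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    norm = (2.0 * np.pi * bandwidth ** 2) ** (-d / 2.0)
    return norm * np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def robust_kde_weights(X, bandwidth, huber_threshold=None, n_iter=100, tol=1e-8):
    """Iteratively reweighted estimate of sample weights (illustrative sketch).

    The estimate has the form f(x) = sum_i w_i k(x, x_i) with w_i >= 0 and
    sum_i w_i = 1. Uniform weights give the standard KDE.
    """
    n = X.shape[0]
    K = gaussian_kernel_matrix(X, X, bandwidth)
    w = np.full(n, 1.0 / n)  # start from the standard KDE
    a = huber_threshold
    for _ in range(n_iter):
        # Squared feature-space distance between each sample and the estimate:
        # ||Phi(x_i) - f||^2 = k(x_i, x_i) - 2 (K w)_i + w^T K w
        sq = np.diag(K) - 2.0 * (K @ w) + w @ K @ w
        r = np.sqrt(np.maximum(sq, 0.0))
        if a is None:
            a = float(np.median(r))  # assumed heuristic, fixed after first pass
        # Huber loss: psi(r) / r equals 1 below the threshold and a / r above it
        phi = np.where(r <= a, 1.0, a / np.maximum(r, 1e-12))
        w_new = phi / phi.sum()
        converged = np.abs(w_new - w).max() < tol
        w = w_new
        if converged:
            break
    return w

def weighted_kde(X_train, w, X_query, bandwidth):
    """Evaluate the weighted density estimate at the query points."""
    return gaussian_kernel_matrix(X_query, X_train, bandwidth) @ w
```

With uniform weights `weighted_kde` reduces to the ordinary KDE; with the weights from `robust_kde_weights`, isolated points contribute less to the estimate, which is the qualitative behavior the abstract describes for contaminated data.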

【 Preview 】
Attachment List
Files | Size | Format | View
Kernel Methods for Classification with Irregularly Sampled and Contaminated Data. | 784KB | PDF | download
  Document Metrics
  Downloads: 26  Views: 102