Thesis details
Machine Learning for Flow Cytometry Data Analysis.
Lee, Gyemin; Nguyen, Long
University of Michigan
Keywords: Machine Learning; Flow Cytometry; Support Vector Machine; Mixture Models; EM Algorithms; Statistical File Matching; Electrical Engineering; Engineering; Electrical Engineering: Systems
Others  :  https://deepblue.lib.umich.edu/bitstream/handle/2027.42/89818/gyemin_1.pdf?sequence=1&isAllowed=y
Switzerland | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

This thesis concerns the problem of automatic flow cytometry data analysis. Flow cytometry is a technique for rapid cell analysis and is widely used in many biomedical and clinical laboratories. Quantitative measurements from a flow cytometer provide rich information about various physical and chemical characteristics of a large number of cells. In clinical applications, flow cytometry data is visualized on a sequence of two-dimensional scatter plots and analyzed through a manual process called "gating". This conventional analysis process requires a large amount of time and labor and is highly subjective and inefficient. In this thesis, we present novel machine learning methods for flow cytometry data analysis to address these issues.

We begin with a method for generating a high-dimensional flow cytometry dataset from multiple low-dimensional datasets. We present an imputation algorithm based on clustering and show that it improves upon a simple nearest-neighbor-based approach that often induces spurious clusters in the imputed data. This technique enables the analysis of multi-dimensional flow cytometry data beyond the fundamental measurement limits of instruments.

We then present two machine learning methods for automatic gating problems. Gating is a process of identifying interesting subsets of cell populations. Pathologists make clinical decisions by inspecting the results from gating. Unfortunately, this process is performed manually in most clinical settings and poses many challenges in high-throughput analysis.

The first approach is an unsupervised learning technique based on multivariate mixture models. Since measurements from a flow cytometer are often censored and truncated, standard model-fitting algorithms can cause biases and lead to poor gating results. We propose novel algorithms for fitting multivariate Gaussian mixture models to data that is truncated, censored, or both.

Our second approach is a transfer learning technique combined with the low-density separation principle. Unlike conventional unsupervised learning approaches, this method can leverage existing datasets previously gated by domain experts to automatically gate a new flow cytometry dataset. Moreover, the proposed algorithm can adaptively account for biological variations in multiple datasets.

We demonstrate these techniques on clinical flow cytometry data and evaluate their effectiveness.
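
To make the file-matching setting concrete, the following is a minimal sketch of the simple nearest-neighbor baseline that the abstract says the thesis improves upon. It is not the thesis's cluster-based algorithm. The function name `nn_impute`, the variable names, and the toy values are all assumptions made for illustration: two files share some markers, and each cell in the first file borrows the second file's extra marker from its closest cell on the shared markers.

```python
import math

def nn_impute(target_common, donor_common, donor_specific):
    """For each target cell, copy the donor-only markers of the nearest donor cell.

    Hypothetical helper: distances are Euclidean, computed on shared markers only.
    """
    imputed = []
    for cell in target_common:
        # Index of the donor cell closest to this target cell on the shared markers.
        nearest = min(
            range(len(donor_common)),
            key=lambda j: math.dist(cell, donor_common[j]),
        )
        imputed.append(list(donor_specific[nearest]))
    return imputed

# Toy data (made up): two shared markers per cell, one marker measured only in file 2.
file1_common = [[0.1, 0.2], [5.0, 5.1]]
file2_common = [[0.0, 0.0], [5.0, 5.0], [9.0, 9.0]]
file2_specific = [[10.0], [20.0], [30.0]]

file1_imputed = nn_impute(file1_common, file2_common, file2_specific)
print(file1_imputed)  # [[10.0], [20.0]]
```

Because each imputed value is copied from a single donor cell chosen on the shared markers alone, cells from different populations that overlap on those markers can receive mismatched values, which is one way the spurious clusters mentioned in the abstract could arise.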

【 Preview 】
Attachment list
Files Size Format View
Machine Learning for Flow Cytometry Data Analysis. 1777KB PDF download
  Document metrics
  Downloads: 9  Views: 14