Thesis Details
Utilizing Machine Learning Techniques to Rapidly Identify MUC2 Expression in Colon Cancer Tissues
Periyakoil, Preethi Kasthuri; Yue, Yisong
University: California Institute of Technology
Department: Engineering and Applied Science
Keywords: machine learning, biology, medicine, cancer, colon, AI, ML, computer science
Others: https://thesis.library.caltech.edu/11159/1/PreethiPeriyakoil_SeniorThesis.pdf
United States | English
Source: Caltech THESIS
【 Abstract 】

Colorectal cancer is the third-most common form of cancer among American men and women. Like most tumors, colon cancer is sustained by a subpopulation of “stem cells” that possess the ability to self-renew and differentiate into more specialized cell types. It would be useful to detect stem cells in images of colon cancer tissue, but the first step toward doing so is to know which genes are expressed in the stem cells and how to detect their expression pattern from the tissue images. Machine learning (ML) is a powerful tool that is widely used in biological research to facilitate rapid diagnosis of cancer. The current study demonstrates the feasibility and effectiveness of using ML techniques to rapidly detect the expression of the gene MUC2 (mucin 2) in colon cancer tissue images. We analyzed histological images of colon cancer and segmented the nuclei to look for features (area, perimeter, eccentricity, compactness, etc.) that correlate with high or low levels of MUC2. Grid search was then run on this data set to tune the hyperparameters, and the following models were tested as potential classifiers: random forest, gradient boosting, decision trees with AdaBoost, and support vector machines. Of the tested models, the random forest classifier (F1 score of 0.71) and the gradient boosting classifier (F1 score of 0.72) predicted the output label most accurately. Under certain conditions, we have identified four features that have predictive capabilities. Predicting individual gene expression with machine learning is the first step in detecting genes that are specific to cancer stem cells in the early stages of cancer, while there is still hope for a cure.
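
The abstract names nuclear shape features (area, perimeter, eccentricity, compactness) and reports F1 scores, but this record does not give their definitions. The following Python sketch illustrates how such quantities are commonly computed. Everything in it is an assumption for illustration: the function names are hypothetical, the nucleus outline is taken to be a simple polygon, compactness is taken as 4πA/P² (other conventions exist), and eccentricity is computed for a fitted ellipse with known axes. It is not the thesis's actual pipeline.

```python
import math


def polygon_area(points):
    """Area of a simple polygon [(x, y), ...] via the shoelace formula."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the outline
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths of the closed polygon."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def compactness(points):
    """Assumed definition: 4*pi*A / P^2 (1.0 for a circle, lower otherwise)."""
    area = polygon_area(points)
    perimeter = polygon_perimeter(points)
    return 4.0 * math.pi * area / perimeter ** 2


def ellipse_eccentricity(major_axis, minor_axis):
    """Eccentricity of an ellipse: 0 for a circle, approaching 1 when elongated."""
    return math.sqrt(1.0 - (minor_axis / major_axis) ** 2)


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


# Toy outline: a 2 x 2 square standing in for a segmented nucleus.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
features = {
    "area": polygon_area(square),            # 4.0
    "perimeter": polygon_perimeter(square),  # 8.0
    "compactness": compactness(square),      # pi / 4
}
```

In practice, features like these would be computed per nucleus, assembled into a table, and passed to the classifiers listed in the abstract, with F1 used to compare the tuned models.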

【 Preview 】
Attachments
Files Size Format
Utilizing Machine Learning Techniques to Rapidly Identify MUC2 Expression in Colon Cancer Tissues 417KB PDF