Journal Article Details
Journal of Computational Biology: A Journal of Computational Molecular Cell Biology
Machine Learning-Based Method for Obesity Risk Evaluation Using Single-Nucleotide Polymorphisms Derived from Next-Generation Sequencing
Jang-Jih Lu^8,1,3; Bo-Yu Chu^7; Szu-Hsien Chiang^1; Hsin-Yao Wang^1,2; Wan-Ying Lin^4; Kai-Yao Huang^6; Chun-Hsien Chen^5; Shih-Cheng Chang^1,3; Tzong-Yi Lee^9,7,10,11
^1 Department of Laboratory Medicine, Chang Gung Memorial Hospital, Taoyuan City, Taiwan
^2 Ph.D. Program in Biomedical Engineering, Chang Gung University, Taoyuan City, Taiwan
^3 Department of Medical Biotechnology and Laboratory Science, College of Medicine, Chang Gung University, Taoyuan City, Taiwan
^4 Department of Physical Medicine and Rehabilitation, Chang Gung Memorial Hospital, Taoyuan City, Taiwan
^5 Department of Information Management, Chang Gung University, Taoyuan City, Taiwan
^6 Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City, Taiwan
^7 Department of Computer Science and Engineering, Yuan Ze University, Taoyuan City, Taiwan
^8 Address correspondence to: Prof. Jang-Jih Lu, Department of Laboratory Medicine, Chang Gung Memorial Hospital, 333, No. 5, Fuxing St, Guishan District, Taoyuan City, Taiwan
^9 Address correspondence to: Prof. Tzong-Yi Lee, Department of Computer Science and Engineering, Yuan Ze University, 135 Yuan-Tung Road, Chungli, Taoyuan City 32003, Taiwan
^10 Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, China
^11 School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China
Keywords: machine learning; next-generation sequencing (NGS); obesity; single-nucleotide polymorphisms (SNPs)
DOI: 10.1089/cmb.2018.0002
Subject classification: Biological Sciences (General)
Source: Mary Ann Liebert, Inc. Publishers
【 Abstract 】

Obesity is a major risk factor for many metabolic diseases. Single-nucleotide polymorphisms (SNPs) derived from next-generation sequencing (NGS) offer genome-wide insight into the genetic characteristics of obese individuals. However, interpreting these SNP data for clinical application is difficult because NGS data are highly complex. Hence, in this study, SNP-based obesity risk prediction models were designed using machine learning (ML) methods, namely support vector machine (SVM), k-nearest neighbor, and decision tree (DT). Clinicopathological features, comprising 130 SNPs, sex, and age, were obtained from 139 eligible individuals. Various feature selection methods, such as stepwise multivariate linear regression (MLR), DT, and genetic algorithms, were applied to select informative features for generating obesity prediction models. Multivariate logistic regression was used to evaluate the importance of the selected features. The predictive performance of the models trained on the various feature sets was evaluated by fivefold cross-validation. Three measures, namely accuracy, sensitivity, and specificity, were used to compare the predictive power of the models. Nine SNPs (rs10501087, rs17700144, rs2287019, rs534870, rs660339, rs7081678, rs718314, rs9816226, and rs984222) were selected by stepwise MLR to build the ML-based obesity prediction models. When trained on the same features, the SVM model significantly outperformed the other classifiers, achieving 70.77% accuracy, 80.09% sensitivity, and 63.02% specificity. This investigation demonstrates that the selected SNPs were effective in detecting obesity risk. Additionally, the ML-based method provides a feasible means of conducting preliminary analyses of the genetic characteristics of obesity.
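
The evaluation described in the abstract (fivefold cross-validation scored by accuracy, sensitivity, and specificity) can be illustrated with a minimal sketch in plain Python. This is not the authors' code. The function names `confusion_metrics` and `kfold_indices` are hypothetical, the label coding (1 = obese, 0 = non-obese) is an assumption, and the contiguous, unshuffled folds are an assumption, since the abstract does not say how the folds were formed.

```python
from typing import List, Sequence, Tuple


def confusion_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (accuracy, sensitivity, specificity) for binary labels.

    Assumed coding: 1 = obese (positive class), 0 = non-obese.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: fraction of truly positive samples predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of truly negative samples predicted negative.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity


def kfold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds.

    Returns one (train_indices, test_indices) pair per fold. The first
    n_samples % k folds receive one extra sample each.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


if __name__ == "__main__":
    # Toy labels only; these are not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
    print(confusion_metrics(y_true, y_pred))  # (0.75, 0.75, 0.75)
    # 139 individuals, as in the abstract, split into five folds.
    print([len(test) for _, test in kfold_indices(139, 5)])  # [28, 28, 28, 28, 27]
```

In a full pipeline, a classifier would be fitted on each fold's training indices and scored on its test indices, and the three measures would then be averaged over the five folds. With real data, the samples would normally be shuffled, or stratified by class, before splitting.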

【 License 】

Unknown   
