Journal article details
BMC Research Notes
Disease prediction via Bayesian hyperparameter optimization and ensemble learning
Liyuan Gao [1], Yongmei Ding [1]
[1] College of Science, Wuhan University of Science and Technology
Keywords: Hyperparameter optimization; Feature selection; Ensemble learning; Gain
DOI: 10.1186/s13104-020-05050-0
Source: DOAJ
【 Abstract 】

Objective: Early disease screening and diagnosis are important for improving patient survival, so identifying early predictive features of disease is necessary. This paper presents a comparative analysis of different machine learning (ML) systems and reports the standard deviation of the results obtained through sampling with replacement. The research aims (a) to analyze and compare ML strategies used to predict breast cancer (BC) and cardiovascular disease (CVD) and (b) to use feature importance ranking to identify early high-risk features.

Results: The Bayesian hyperparameter optimization method was more stable than the grid search and random search methods. On a BC diagnosis dataset, the Extreme Gradient Boosting (XGBoost) model had an accuracy of 94.74% and a sensitivity of 93.69%. The mean value of the cell nucleus in the fine needle aspiration (FNA) digital image of the breast lump was identified as the most important predictive feature for BC. On a CVD dataset, the XGBoost model had an accuracy of 73.50% and a sensitivity of 69.54%. Systolic blood pressure was identified as the most important feature for CVD prediction.
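The abstract mentions reporting the standard deviation of results obtained through sampling with replacement. As a rough illustration only, the sketch below estimates the spread of an accuracy score by resampling a set of predictions with replacement. The function name `bootstrap_accuracy_std`, the choice to resample test-set predictions, and the default of 1000 resamples are assumptions made for this example; the abstract does not describe the authors' actual procedure.

```python
import random
import statistics


def bootstrap_accuracy_std(y_true, y_pred, n_resamples=1000, seed=0):
    """Return (mean, std) of accuracy over bootstrap resamples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    rng = random.Random(seed)
    n = len(y_true)
    # 1 where the prediction matches the label, 0 otherwise.
    correct = [int(t == p) for t, p in zip(y_true, y_pred)]

    accuracies = []
    for _ in range(n_resamples):
        # Draw n indices with replacement and score that resample.
        indices = [rng.randrange(n) for _ in range(n)]
        accuracies.append(sum(correct[i] for i in indices) / n)

    return statistics.mean(accuracies), statistics.stdev(accuracies)


if __name__ == "__main__":
    labels = [1, 0, 1, 1, 0, 1, 0, 0]
    preds = [1, 0, 0, 1, 0, 1, 1, 0]
    mean_acc, std_acc = bootstrap_accuracy_std(labels, preds)
    print(f"accuracy = {mean_acc:.3f} +/- {std_acc:.3f}")
```

A smaller standard deviation across resamples would suggest a more stable model or tuning method, which is one way a stability comparison between search strategies could be read.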

【 License 】

Unknown   
