Journal article details
Frontiers in Genetics
An ensemble learning approach for diabetes prediction using boosting techniques
Genetics
Shahid Mohammad Ganie [1], Hong Qin [2], Majid Bashir Malik [3], Saurav Mallik [4], Pijush Kanti Dutta Pramanik [5]
[1] AI Research Centre, School of Business, Woxsen University, Hyderabad, India
[2] College of Engineering and Computer Science, University of Tennessee at Chattanooga, Chattanooga, TN, United States
[3] Department of Computer Science, Baba Ghulam Shah Badshah University, Rajauri, India
[4] Department of Environmental Health, School of Public Health, Harvard University, Boston, MA, United States
[5] School of Computer Applications and Technology, Galgotias University, Greater Noida, India
Keywords: diabetes prediction; ensemble learning; XGBoost; CatBoost; LightGBM; AdaBoost; gradient boost
DOI: 10.3389/fgene.2023.1252159
Received 2023-07-03, accepted 2023-10-16, published 2023
Source: Frontiers
【 Abstract 】

Introduction: Diabetes is one of the leading healthcare concerns, affecting millions of people worldwide. Taking appropriate action at the earliest stages of the disease depends on early prediction and identification of diabetes. In recent years, machine learning has been explored in the healthcare industry to support providers in the diagnosis and prognosis of diseases.

Methods: To predict diabetes, this study evaluated five boosting algorithms on the Pima diabetes dataset. The dataset was obtained from the University of California, Irvine (UCI) machine learning repository and contains several important clinical features. Exploratory data analysis was used to identify the characteristics of the dataset. In addition, upsampling, normalisation, feature selection, and hyperparameter tuning were employed for predictive analytics.

Results: The results were analysed using various statistical and machine learning metrics together with k-fold cross-validation. Gradient boosting achieved the highest accuracy among all the classifiers, at 92.85%. Precision, recall, F1-score, and receiver operating characteristic (ROC) curves were used to further validate the model.

Discussion: The proposed model outperformed existing studies in terms of prediction accuracy, indicating its applicability to other diseases with similar predictive indicators.
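
The workflow named in the Methods (upsampling, normalisation, a boosting classifier, k-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' code. It assumes scikit-learn, uses a synthetic stand-in with the same shape as the Pima dataset (768 rows, 8 features) rather than the real data, and omits feature selection and hyperparameter tuning. The helper names `upsample_minority` and `cross_validate_boosting` are hypothetical, and resampling only the training folds is a choice made here, not something the abstract specifies.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import MinMaxScaler
from sklearn.utils import resample


def upsample_minority(X, y, seed=0):
    """Resample the minority class with replacement to match the majority (binary labels)."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    X_min, X_maj = X[y == minority], X[y != minority]
    y_maj = y[y != minority]
    X_up = resample(X_min, replace=True, n_samples=len(X_maj), random_state=seed)
    y_up = np.full(len(X_up), minority)
    return np.vstack([X_maj, X_up]), np.concatenate([y_maj, y_up])


def cross_validate_boosting(X, y, n_splits=5, seed=0):
    """Return mean accuracy, precision, recall and F1 over stratified k folds."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = {"accuracy": [], "precision": [], "recall": [], "f1": []}
    for train_idx, test_idx in skf.split(X, y):
        # Upsample and fit the scaler on the training fold only,
        # so the held-out fold does not influence preprocessing.
        X_tr, y_tr = upsample_minority(X[train_idx], y[train_idx], seed)
        scaler = MinMaxScaler().fit(X_tr)
        model = GradientBoostingClassifier(random_state=seed)
        model.fit(scaler.transform(X_tr), y_tr)
        y_te = y[test_idx]
        y_pred = model.predict(scaler.transform(X[test_idx]))
        scores["accuracy"].append(accuracy_score(y_te, y_pred))
        scores["precision"].append(precision_score(y_te, y_pred))
        scores["recall"].append(recall_score(y_te, y_pred))
        scores["f1"].append(f1_score(y_te, y_pred))
    return {name: float(np.mean(vals)) for name, vals in scores.items()}


# Synthetic, imbalanced stand-in data (NOT the Pima dataset).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5,
    weights=[0.65, 0.35], random_state=0,
)
results = cross_validate_boosting(X, y)
print(results)
```

Because the data here is synthetic, the printed scores say nothing about the 92.85% accuracy reported in the abstract; the sketch only shows how the named steps fit together.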

【 License 】

Unknown   
Copyright © 2023 Ganie, Pramanik, Bashir Malik, Mallik and Qin.
