Frontiers in Digital Health
Identifying potential circulating miRNA biomarkers for the diagnosis and prediction of ovarian cancer using machine-learning approach: application of Boruta
Digital Health
Hanif Yaghoobi [1], Esmaeil Babaei [2], Reza Arabi Belaghi [3], Jamileh Malakouti [4], Parvin Sarbakhsh [5], Farzaneh Hamidi [5], Neda Gilani [6]
[1] Department of Biological Sciences, School of Natural Sciences, University of Tabriz, Tabriz, Iran; Department of Biological Sciences, School of Natural Sciences, University of Tabriz, Tabriz, Iran; Interfaculty Institute for Bioinformatics and Medical Informatics (IBMI), University of Tübingen, Tübingen, Germany; Department of Mathematics, Applied Mathematics and Statistics, Uppsala University, Uppsala, Sweden; Department of Statistics, Faculty of Mathematical Science, University of Tabriz, Tabriz, Iran; Department of Energy and Technology, Swedish Agricultural University, Uppsala, Sweden; Department of Midwifery, Faculty of Nursing and Midwifery, Tabriz University of Medical Science, Tabriz, Iran; Department of Statistics and Epidemiology, Faculty of Health, Tabriz University of Medical Sciences, Tabriz, Iran; Department of Statistics and Epidemiology, Faculty of Health, Tabriz University of Medical Sciences, Tabriz, Iran; Road Traffic Injury Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
Keywords: artificial intelligence; Boruta; biomarker; feature selection; Gene Expression Omnibus; ovarian cancer; oncology
DOI: 10.3389/fdgth.2023.1187578
Received 2023-03-16, accepted 2023-07-20, published 2023
Source: Frontiers
【 Abstract 】
Introduction: In gynecologic oncology, ovarian cancer is a major clinical challenge. Because typical symptoms and effective biomarkers for noninvasive screening are lacking, most patients already have advanced-stage ovarian cancer by the time of diagnosis. MicroRNAs (miRNAs) are a type of non-coding RNA molecule that has been linked to human cancers. Identifying diagnostic biomarkers that separate cancer samples from non-cancer samples is difficult.

Methods: Using Boruta, a novel random forest-based feature selection technique in machine learning, we aimed to identify biomarkers associated with ovarian cancer from cancerous and non-cancer samples in the Gene Expression Omnibus (GEO) data set GSE106817. Two independent GEO data sets, GSE113486 and GSE113740, served as external validation. We used five state-of-the-art machine-learning algorithms for classification: logistic regression, random forest, decision trees, artificial neural networks, and XGBoost.

Results: Four models discovered in GSE113486 had an AUC of 100%, three in GSE113740 had an AUC of over 94%, and four in GSE113486 had an AUC of over 94%. We identified 10 miRNAs that distinguish ovarian cancer cases from normal controls: hsa-miR-1290, hsa-miR-1233-5p, hsa-miR-1914-5p, hsa-miR-1469, hsa-miR-4675, hsa-miR-1228-5p, hsa-miR-3184-5p, hsa-miR-6784-5p, hsa-miR-6800-5p, and hsa-miR-5100. Our findings suggest that miRNAs could serve as biomarkers for ovarian cancer screening and possible intervention.
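The shadow-feature idea behind Boruta, the feature selection method named in the abstract, can be illustrated with a minimal sketch. This is not the authors' pipeline. It makes several assumptions: a simple scaled mean-difference score is swapped in for random forest importance, a fixed hit-rate threshold replaces Boruta's statistical test, and the data, the feature names, and the helpers `importance` and `shadow_select` are hypothetical.

```python
import random
import statistics


def importance(values, labels):
    """Stand-in importance score (assumption): absolute difference in class
    means, scaled by the overall standard deviation. Boruta itself uses
    random forest importance instead."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = statistics.pstdev(values) or 1.0
    return abs(statistics.fmean(pos) - statistics.fmean(neg)) / spread


def shadow_select(columns, labels, n_rounds=50, min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shuffled ("shadow") feature in at
    least min_hit_rate of the rounds. The fixed threshold is a simplification
    of the statistical test used by Boruta."""
    rng = random.Random(seed)
    hits = {name: 0 for name in columns}
    for _ in range(n_rounds):
        # A shadow is a shuffled copy of a column, so any link between
        # its values and the labels is destroyed.
        best_shadow = 0.0
        for values in columns.values():
            shadow = values[:]
            rng.shuffle(shadow)
            best_shadow = max(best_shadow, importance(shadow, labels))
        # A real feature scores a hit only if it beats every shadow.
        for name, values in columns.items():
            if importance(values, labels) > best_shadow:
                hits[name] += 1
    return [name for name, h in hits.items() if h / n_rounds >= min_hit_rate]


# Hypothetical toy data: 20 control samples (0) and 20 cancer samples (1).
labels = [0] * 20 + [1] * 20
columns = {
    # Shifts upward by 3 in the cancer class.
    "informative_feature": [3 * y + 0.5 * (i % 4) for i, y in enumerate(labels)],
    # Identical distribution in both classes.
    "noise_feature": [float(i % 5) for i in range(40)],
}

selected = shadow_select(columns, labels)
print(selected)
```

In this toy setup only the feature whose values shift between the two classes is expected to survive the comparison with its shuffled copies. In a workflow like the one described in the abstract, the surviving features would then be passed to the classifiers.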
【 License 】
Unknown
© 2023 Hamidi, Gilani, Arabi Belaghi, Yaghoobi, Babaei, Sarbakhsh and Malakouti.