| Algorithms | |
| Seminal Quality Prediction Using Clustering-Based Decision Forests | |
| Hong Wang [1], Qingsong Xu [1] | |
| [1] School of Mathematics & Statistics, Central South University, Changsha, Hunan, 410075, China | |
| Keywords: seminal prediction; imbalanced learning; variable importance | |
| DOI: 10.3390/a7030405 | |
| Source: MDPI | |
【 Abstract 】
Predicting seminal quality with statistical learning tools is an emerging methodology for decision support systems in biomedical engineering. It is useful both for the early diagnosis of patients with seminal disorders and for the selection of semen donor candidates. However, as is common in medical diagnosis, seminal quality prediction faces the class imbalance problem. In this paper, we propose a novel supervised ensemble learning approach, Clustering-Based Decision Forests, to tackle the class imbalance problem in seminal quality prediction. Experimental results on a real fertility diagnosis dataset show that Clustering-Based Decision Forests outperform decision trees, support vector machines, random forests, multilayer perceptron neural networks, and logistic regression by a noticeable margin. Clustering-Based Decision Forests can also be used to evaluate variable importance. The five most important factors affecting semen concentration identified in this study are age, serious trauma, sitting time, the season in which the semen sample was produced, and high fevers in the last year. These findings could help explain seminal concentration problems in infertile males and support the pre-screening of semen donor candidates.
【 License 】
CC BY
© 2014 by the authors; licensee MDPI, Basel, Switzerland.