Joint Conference on Green Engineering Technology & Applied Computing 2019 | |
Empirical Performance Evaluation of Imputation Techniques using Medical Dataset | |
Industrial technology (general); Computer science | |
Alade, O.A.^1^2 ; Sallehuddin, R.^1 ; Selamat, A.^1 | |
School of Computing, Universiti Teknologi Malaysia, Skudai, Johor | |
81310, Malaysia^1 | |
Computer Science Department, Federal Polytechnic, Bida | |
912101, Nigeria^2 | |
Keywords: Density distributions; Empirical performance; Imputation techniques; K-nearest neighbours; Missing not at random; Missing value imputation; Missingness mechanism; Root mean square errors | |
Others : https://iopscience.iop.org/article/10.1088/1757-899X/551/1/012055/pdf DOI : 10.1088/1757-899X/551/1/012055 | |
Source: IOP | |
【 Abstract 】
This paper evaluates error measures for missing value imputation in medical research. Several imputation techniques have been designed and implemented; however, the degree to which imputed values deviate from the original values has not been given adequate attention. Predictive Mean Matching Imputation (PMMI) and K-Nearest Neighbour Imputation (KNNI) were applied to a fertility dataset under three missingness mechanisms: Missing At Random (MAR), Missing Completely At Random (MCAR) and Missing Not At Random (MNAR). The results were evaluated by mean square error (MSE), root mean square error (RMSE) and mean absolute error (MAE). PMMI performed better than KNNI in all the results. For MSE, for example, the PMMI/KNNI values were 0.0260/2.8555 for 1-10% MAR (a 99.09% reduction in error), 0.1108/3.0120 for 30-40% MCAR (a 96.32% reduction) and 0.0642/3.7187 for 40-50% MNAR (a 98.27% reduction). MCAR was the most consistent missingness mechanism across the evaluations. Density distributions of the imputed datasets were compared with that of the original dataset; the distribution plots of the imputed data followed the curve of the original dataset.
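
The three error measures and the quoted reduction percentages can be illustrated with a minimal Python sketch. The function names are hypothetical, and the reduction formula (1 - PMMI/KNNI) is an assumption: the abstract does not state how the percentages were derived, although this formula reproduces all three reported figures. The MSE values in the dictionary are taken from the abstract.

```python
import math


def mse(original, imputed):
    """Mean square error between original and imputed values."""
    return sum((o - i) ** 2 for o, i in zip(original, imputed)) / len(original)


def rmse(original, imputed):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(original, imputed))


def mae(original, imputed):
    """Mean absolute error between original and imputed values."""
    return sum(abs(o - i) for o, i in zip(original, imputed)) / len(original)


def error_reduction_pct(pmmi_error, knni_error):
    """Assumed formula: percentage by which PMMI error is below KNNI error."""
    return (1 - pmmi_error / knni_error) * 100


# MSE pairs (PMMI, KNNI) as reported in the abstract.
reported_mse = {
    "1-10% MAR": (0.0260, 2.8555),
    "30-40% MCAR": (0.1108, 3.0120),
    "40-50% MNAR": (0.0642, 3.7187),
}

for setting, (pmmi, knni) in reported_mse.items():
    print(f"{setting}: {error_reduction_pct(pmmi, knni):.2f}% lower MSE with PMMI")
```

The metric functions take two equal-length sequences: the held-out original values and the values an imputation method produced for the same cells.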