Conference Paper Details
Joint Conference on Green Engineering Technology & Applied Computing 2019
Empirical Performance Evaluation of Imputation Techniques using Medical Dataset
Industrial Technology (General); Computer Science
Alade, O.A.^1^2 ; Sallehuddin, R.^1 ; Selamat, A.^1
School of Computing, Universiti Teknologi Malaysia, Skudai, Johor 81310, Malaysia^1
Computer Science Department, Federal Polytechnic, Bida 912101, Nigeria^2
Keywords: Density distributions; Empirical performance; Imputation techniques; K-nearest neighbours; Missing not at random; Missing value imputation; Missingness mechanism; Root mean square errors
Others  :  https://iopscience.iop.org/article/10.1088/1757-899X/551/1/012055/pdf
DOI  :  10.1088/1757-899X/551/1/012055
Source: IOP
【 Abstract 】

This paper evaluates error measures for missing value imputation in medical research. Several imputation techniques have been designed and implemented; however, the degree to which imputed values deviate from the original values has not received adequate attention. Predictive Mean Matching Imputation (PMMI) and K-Nearest Neighbour Imputation (KNNI) were applied to a fertility dataset under three missingness mechanisms: Missing At Random (MAR), Missing Completely At Random (MCAR) and Missing Not At Random (MNAR). The results were evaluated by mean square error (MSE), root mean square error (RMSE) and mean absolute error (MAE). PMMI performed better than KNNI in all the results. For MSE, for example, the PMMI/KNNI ratio was 0.0260/2.8555 for 1-10% MAR (a 99.09% reduction in error), 0.1108/3.0120 for 30-40% MCAR (a 96.32% reduction) and 0.0642/3.7187 for 40-50% MNAR (a 98.27% reduction). MCAR was the most consistent missingness mechanism across the evaluations. Density distributions of the imputed datasets were compared with those of the original dataset; the distribution plots of the imputed data followed the curve of the original dataset.
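
The three error measures named in the abstract, and the arithmetic behind the quoted reduction percentages, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `error_measures` and `error_reduction_percent` are hypothetical, and the reduction formula (1 - PMMI/KNNI) is an assumption, chosen because it reproduces the percentages stated in the abstract.

```python
import math


def error_measures(original, imputed):
    """Return (MSE, RMSE, MAE) between original and imputed values.

    Both arguments are equal-length sequences of numbers covering the
    cells that were masked as missing and then imputed.
    """
    if len(original) != len(imputed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [o - i for o, i in zip(original, imputed)]
    mse = sum(d * d for d in diffs) / len(diffs)
    rmse = math.sqrt(mse)
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return mse, rmse, mae


def error_reduction_percent(lower_error, higher_error):
    """Percentage by which lower_error undercuts higher_error.

    Assumed formula: (1 - lower / higher) * 100.
    """
    return (1.0 - lower_error / higher_error) * 100.0


# MSE pairs (PMMI, KNNI) exactly as quoted in the abstract.
reported_mse = {
    "1-10% MAR": (0.0260, 2.8555),
    "30-40% MCAR": (0.1108, 3.0120),
    "40-50% MNAR": (0.0642, 3.7187),
}

for label, (pmmi, knni) in reported_mse.items():
    reduction = error_reduction_percent(pmmi, knni)
    print(f"{label}: {reduction:.2f}% lower MSE with PMMI")
```

Under this assumed formula the three pairs give 99.09%, 96.32% and 98.27%, matching the figures in the abstract.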
