Journal article

【Abstract】

Missing data occur in almost all research and introduce ambiguity into data analysis. They must therefore be handled appropriately to obtain an efficient and valid analysis. In the present study, we compare six imputation methods: mean imputation, K-nearest neighbors (KNN), fuzzy K-means (FKM), singular value decomposition (SVD), Bayesian principal component analysis (bPCA), and multiple imputation by chained equations (MICE). The comparison was performed on four real datasets of various sizes (from 4 to 65 variables), under a missing completely at random (MCAR) assumption, using four evaluation criteria: root mean squared error (RMSE), unsupervised classification error (UCE), supervised classification error (SCE), and execution time. Our results suggest that bPCA and FKM are two imputation methods of interest that deserve further consideration in practice.
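
The evaluation protocol summarized in the abstract (remove values completely at random, impute them, then score the imputed values by RMSE against the known truth) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the synthetic data, the 10% missing rate, the function names `make_mcar`, `impute_mean`, `impute_knn` and `rmse`, and the simplified KNN distance. Only mean and KNN imputation are shown; the other four methods and the classification-error criteria are omitted.

```python
import numpy as np

def make_mcar(X, rate, rng):
    """Copy X and set each entry to NaN independently with probability `rate` (MCAR)."""
    mask = rng.random(X.shape) < rate
    X_miss = X.copy()
    X_miss[mask] = np.nan
    return X_miss, mask

def impute_mean(X_miss):
    """Replace each missing entry with the mean of the observed values in its column."""
    col_means = np.nanmean(X_miss, axis=0)
    return np.where(np.isnan(X_miss), col_means, X_miss)

def impute_knn(X_miss, k=5):
    """Replace each missing entry with the column average over the k nearest rows.

    Distance between two rows is the mean squared difference over the columns
    observed in both rows (a simplification chosen for this sketch).
    """
    X_out = impute_mean(X_miss)              # fallback when no neighbor is usable
    obs = ~np.isnan(X_miss)
    X_zero = np.where(obs, X_miss, 0.0)
    for i in np.where(~obs.all(axis=1))[0]:  # rows with at least one missing value
        shared = obs & obs[i]                # columns observed in row i and in each other row
        n_shared = shared.sum(axis=1)
        sq = (((X_zero - X_zero[i]) * shared) ** 2).sum(axis=1)
        dist = np.full(X_miss.shape[0], np.inf)
        ok = n_shared > 0
        dist[ok] = sq[ok] / n_shared[ok]
        dist[i] = np.inf                     # a row is not its own neighbor
        for j in np.where(~obs[i])[0]:
            cand = np.where(obs[:, j] & np.isfinite(dist))[0]
            if cand.size == 0:
                continue                     # keep the column-mean fallback
            nearest = cand[np.argsort(dist[cand])[:k]]
            X_out[i, j] = X_miss[nearest, j].mean()
    return X_out

def rmse(X_true, X_imputed, mask):
    """Root mean squared error computed on the artificially removed entries only."""
    return float(np.sqrt(np.mean((X_true[mask] - X_imputed[mask]) ** 2)))

# Synthetic, correlated data standing in for a real dataset (assumption).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(200, 6))

X_miss, mask = make_mcar(X, rate=0.10, rng=rng)
X_mean = impute_mean(X_miss)
X_knn = impute_knn(X_miss, k=5)

print("RMSE, mean imputation:", rmse(X, X_mean, mask))
print("RMSE, KNN imputation: ", rmse(X, X_knn, mask))
```

Scoring only the masked entries keeps the criterion focused on imputation quality, since observed entries are left unchanged by both methods.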

【License】

Unknown


Journal of Biometrics & Biostatistics
A Comparison of Six Methods for Missing Data Imputation
article
Peter Schmitt¹, Jonas Mandel¹, Mickael Guedj¹
[1] Department of Bioinformatics and Biostatistics
Keywords: Missing data; Imputation methods; Comparison study; Missing completely at random; bPCA
DOI : 10.4172/2155-6180.1000224
Source: Hilaris Publisher


