PeerJ
A combined test for feature selection on sparse metaproteomics data—an alternative to missing value imputation
article
Sandra Plancade [1]; Magali Berland [2]; Mélisande Blein-Nicolas [2]; Olivier Langella [2]; Ariane Bassignani [2]; Catherine Juste [3]
[1] UR875 MIAT, Université fédérale de Toulouse;Université Paris-Saclay;Micalis Institute, Université Paris-Saclay
Keywords: Metaproteomics; Feature selection; Missing value imputation; Combined test
DOI: 10.7717/peerj.13525
Subject classification: Social sciences, humanities and arts (general)
Source: Inra
【 Abstract 】

One of the difficulties encountered in the statistical analysis of metaproteomics data is the high proportion of missing values, which are usually handled by imputation. However, imputation methods rely on restrictive assumptions about the missingness mechanism, namely “at random” or “not at random”. To circumvent these limitations in the context of feature selection for a multi-class comparison, we propose a univariate selection method that combines a test of association between missingness and classes with a test for a difference in observed intensities between classes. This approach implicitly handles both missingness mechanisms. We performed a quantitative and qualitative comparison of our procedure with imputation-based feature selection methods on two experimental data sets, as well as on simulated data covering various scenarios for the missingness mechanism and for the nature of the difference in expression (differential intensity or differential presence). Whereas prediction performance on the experimental data sets was similar across methods, the feature rankings and selections produced by the various imputation-based methods diverged strongly. We showed that the combined test reaches a compromise by correlating reasonably well with the other methods, and that it remains efficient in all simulated scenarios, unlike imputation-based feature selection methods.
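The idea of combining a missingness test with an intensity test can be illustrated for a single feature as follows. This is only a minimal sketch under stated assumptions, not the authors' procedure: the abstract names neither the two component tests nor the combination rule, so the sketch assumes label-permutation tests for both components and Fisher's method for the combination (which itself assumes the two p-values are independent under the null). The function names, the encoding of missing values as `None`, and the convention of setting a non-computable component p-value to 1 are all hypothetical choices made for illustration.

```python
import math
import random


def between_class_ss(values, labels):
    """Between-class sum of squares; larger means class means differ more."""
    grand_mean = sum(values) / len(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
               for g in groups.values())


def permutation_pvalue(values, labels, n_perm, rng):
    """Permutation p-value of between_class_ss under shuffled class labels."""
    observed = between_class_ss(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so ties are not lost to floating-point rounding.
        if between_class_ss(values, shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def combined_test(intensities, labels, n_perm=2000, seed=0):
    """Return (p_missing, p_intensity, p_combined) for one feature.

    ASSUMPTIONS (not taken from the paper): missing values are None,
    both component tests are permutation tests, and the two p-values
    are merged with Fisher's method.
    """
    rng = random.Random(seed)

    # Component 1: is being observed (1) or missing (0) associated with class?
    present = [0.0 if x is None else 1.0 for x in intensities]
    n_present = int(sum(present))
    p_missing = 1.0  # convention: no evidence if nothing or everything is missing
    if 0 < n_present < len(intensities):
        p_missing = permutation_pvalue(present, labels, n_perm, rng)

    # Component 2: do observed intensities differ between classes?
    pairs = [(x, g) for x, g in zip(intensities, labels) if x is not None]
    obs_values = [x for x, _ in pairs]
    obs_labels = [g for _, g in pairs]
    p_intensity = 1.0  # convention: no evidence if fewer than two classes observed
    if len(set(obs_labels)) >= 2:
        p_intensity = permutation_pvalue(obs_values, obs_labels, n_perm, rng)

    # Fisher's method for two p-values: -2*ln(p1*p2) follows a chi-square
    # distribution with 4 degrees of freedom under the null, whose survival
    # function has the closed form p1*p2 * (1 - ln(p1*p2)).
    product = p_missing * p_intensity
    p_combined = product * (1.0 - math.log(product))
    return p_missing, p_intensity, p_combined


# Hypothetical feature detected only in class "A" (differential presence).
labels = ["A"] * 6 + ["B"] * 6
presence_only = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8] + [None] * 6
print(combined_test(presence_only, labels))
```

In this sketch a feature that differs only through its presence is picked up by the missingness component, while a fully observed feature is picked up by the intensity component, so no imputation step is needed in either case.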

【 License 】

CC BY   
