Journal article details
Frontiers in Genetics
Subject clustering by IF-PCA and several recent methods
Genetics
Jiashun Jin [1], Zheng Tracy Ke [2], Dieyi Chen [2]
[1] Department of Statistics, Carnegie Mellon University, Pittsburgh, PA, United States; [2] Department of Statistics, Harvard University, Cambridge, MA, United States
Keywords: gene microarray; feature selection; higher criticism threshold; PCA; scRNA-seq; sparsity; subject clustering; variational
DOI: 10.3389/fgene.2023.1166404
Received 2023-02-15; accepted 2023-05-03; published 2023
Source: Frontiers
【 Abstract 】

Subject clustering (i.e., the use of measured features to cluster subjects, such as patients or cells, into multiple groups) is a problem of significant interest. In recent years, many approaches have been proposed, among which unsupervised deep learning (UDL) has received much attention. Two interesting questions are 1) how to combine the strengths of UDL and other approaches and 2) how these approaches compare to each other. We combine the variational auto-encoder (VAE), a popular UDL approach, with the recent idea of influential feature-principal component analysis (IF-PCA) and propose IF-VAE as a new method for subject clustering. We study IF-VAE and compare it with several other methods (including IF-PCA, VAE, Seurat, and SC3) on 10 gene microarray data sets and eight single-cell RNA-seq data sets. We find that IF-VAE shows significant improvement over VAE, but still underperforms compared to IF-PCA. We also find that IF-PCA is quite competitive, slightly outperforming Seurat and SC3 over the eight single-cell data sets. IF-PCA is conceptually simple and permits delicate analysis. We demonstrate that IF-PCA is capable of achieving phase transition in a rare/weak model. Comparatively, Seurat and SC3 are more complex and theoretically difficult to analyze (for these reasons, their optimality remains unclear).
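
To make the idea behind IF-PCA concrete, the following is a toy sketch, not the authors' implementation. It assumes the usual outline of the method: score every feature by how far its standardized values depart from a normal distribution (a Kolmogorov-Smirnov score), keep only the most influential features, and run PCA on what remains. Two simplifications are assumptions made here for brevity: a fixed count `num_features` stands in for the data-driven higher criticism threshold, and a sign split of the leading left singular vector stands in for k-means with two groups. All function names are hypothetical.

```python
import math
import random


def standardize(values):
    """Center a feature and scale it to unit sample variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0.0:
        return [0.0] * n  # a constant feature carries no clustering signal
    return [(v - mean) / sd for v in values]


def ks_score(values):
    """sqrt(n) times the Kolmogorov-Smirnov distance between the
    standardized feature and the standard normal distribution."""
    z = sorted(standardize(values))
    n = len(z)
    if z[0] == z[-1]:
        return 0.0  # constant feature: never select it
    gap = 0.0
    for i, zi in enumerate(z):
        cdf = 0.5 * (1.0 + math.erf(zi / math.sqrt(2.0)))
        # Compare the normal CDF with the empirical CDF just after and just before zi.
        gap = max(gap, (i + 1) / n - cdf, cdf - i / n)
    return math.sqrt(n) * gap


def leading_left_singular_vector(rows, iters=100, seed=0):
    """Power iteration on X X^T, where X has one row per subject."""
    n, m = len(rows), len(rows[0])
    rng = random.Random(seed)
    # Random start: a constant start would be annihilated by centered columns.
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        v = [sum(rows[i][j] * u[i] for i in range(n)) for j in range(m)]
        u = [sum(rows[i][j] * v[j] for j in range(m)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in u)) or 1.0
        u = [x / norm for x in u]
    return u


def if_pca_two_groups(data, num_features):
    """Cluster subjects (rows of `data`) into two groups.

    Assumption: `num_features` replaces the higher criticism threshold,
    and the sign of the leading singular vector replaces k-means.
    """
    columns = [list(col) for col in zip(*data)]
    scores = [ks_score(col) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    keep = ranked[:num_features]
    selected = [standardize(columns[j]) for j in keep]
    rows = [list(r) for r in zip(*selected)]  # back to one row per subject
    u = leading_left_singular_vector(rows)
    labels = [1 if x > 0 else 0 for x in u]
    return labels, sorted(keep)
```

On simulated data where only a few features separate the two groups, calling `if_pca_two_groups(data, num_features=10)` returns one 0/1 label per subject together with the indices of the retained features. In practice the number of retained features would be chosen by the higher criticism threshold rather than fixed in advance.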

【 License 】

Unknown   
Copyright © 2023 Chen, Jin and Ke.
