Journal article details
BMC Bioinformatics
Analysis of single-cell RNA sequencing data based on autoencoders
Pietro Liò [1], Federico Ricciuti [2], Daniela Besozzi [3], Ana Cvejic [4], Andrea Tangherloni [5]
[1] Department of Computer Science and Technology, University of Cambridge, CB3 0FD, Cambridge, UK
[2] Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, Italy
[3] Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, Italy; Bicocca Bioinformatics, Biostatistics and Bioimaging Centre (B4), Milan, Italy
[4] Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, CB2 0AW, Cambridge, UK; Department of Haematology, University of Cambridge, CB2 0AW, Cambridge, UK; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, CB10 1SA, Hinxton, UK
[5] Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, CB2 0AW, Cambridge, UK; Department of Haematology, University of Cambridge, CB2 0AW, Cambridge, UK; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, CB10 1SA, Hinxton, UK; Department of Human and Social Sciences, University of Bergamo, 24129, Bergamo, Italy
Keywords: Autoencoders; scRNA-Seq; Dimensionality reduction; Clustering; Batch correction; Data integration
DOI: 10.1186/s12859-021-04150-3
Source: Springer
【 Abstract 】

Background: Single-cell RNA sequencing (scRNA-Seq) experiments are gaining ground to study the molecular processes that drive normal development as well as the onset of different pathologies. Finding an effective and efficient low-dimensional representation of the data is one of the most important steps in the downstream analysis of scRNA-Seq data, as it could provide a better identification of known or putatively novel cell-types. Another step that still poses a challenge is the integration of different scRNA-Seq datasets. Though standard computational pipelines to gain knowledge from scRNA-Seq data exist, a further improvement could be achieved by means of machine learning approaches.

Results: Autoencoders (AEs) have been effectively used to capture the non-linearities among gene interactions of scRNA-Seq data, so that the deployment of AE-based tools might represent the way forward in this context. We introduce here scAEspy, a unifying tool that embodies: (1) four of the most advanced AEs, (2) two novel AEs that we developed on purpose, (3) different loss functions. We show that scAEspy can be coupled with various batch-effect removal tools to integrate data by different scRNA-Seq platforms, in order to better identify the cell-types. We benchmarked scAEspy against the most used batch-effect removal tools, showing that our AE-based strategies outperform the existing solutions.

Conclusions: scAEspy is a user-friendly tool that enables using the most recent and promising AEs to analyse scRNA-Seq data by only setting up two user-defined parameters. Thanks to its modularity, scAEspy can be easily extended to accommodate new AEs to further improve the downstream analysis of scRNA-Seq data. Considering the relevant results we achieved, scAEspy can be considered as a starting point to build a more comprehensive toolkit designed to integrate multi single-cell omics.

【 License 】

CC BY
