Journal Article Details
Frontiers in Bioinformatics
Gene representation in scRNA-seq is correlated with common motifs at the 3′ end of transcripts
Bioinformatics
Greg Gibson [1], Peng Qiu [2], Xinling Li [2]
[1] School of Biological Sciences, and Center for Integrative Genomics, Georgia Institute of Technology, Atlanta, GA, United States; [2] The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, United States
Keywords: 10X; single-cell RNA sequencing; bulk RNA-seq; data integration; comparison; dropouts; pathway analysis; motif discovery
DOI: 10.3389/fbinf.2023.1120290
Received 2022-12-09, accepted 2023-05-02, published 2023
Source: Frontiers
【 Abstract 】

One important characteristic of single-cell RNA sequencing (scRNA-seq) data is its high sparsity: the gene-cell count matrix contains a high proportion of zeros. This sparsity has motivated widespread discussion of dropouts and missing data, as well as imputation algorithms for scRNA-seq analysis. Here, we investigate whether some genes are more prone to under-detection in scRNA-seq and, if so, what those genes have in common. From public data sources, we gathered paired bulk RNA-seq and scRNA-seq data from 53 human samples generated in diverse biological contexts. We derived pseudo-bulk gene expression by averaging the scRNA-seq data across cells. Comparing the paired bulk and pseudo-bulk gene expression profiles revealed a collection of genes that are frequently under-detected in scRNA-seq relative to bulk RNA-seq. This result was robust to randomization when unpaired bulk and pseudo-bulk gene expression profiles were compared. We performed a motif search on the last 350 bp of the identified genes and observed enrichment of a poly(T) motif. A poly(T) motif toward the tail of these genes may form hairpin structures with the poly(A) tails of their mRNA transcripts, making the transcripts difficult to capture during scRNA-seq library preparation; this is a mechanistic conjecture for why certain genes may be more prone to under-detection in scRNA-seq.
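
The steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state the normalization, the threshold, or the motif-discovery tool, so the counts-per-million scaling, the `min_log2_gap` cutoff, the pseudocount, and the simple longest-T-run scan (a crude stand-in for motif discovery) are all assumptions, and every function name is hypothetical.

```python
import math


def pseudo_bulk(counts):
    """Average a genes x cells count matrix across cells (one value per gene)."""
    return [sum(row) / len(row) for row in counts]


def to_cpm(values):
    """Scale a profile to counts per million (assumed normalization)."""
    total = sum(values)
    return [v / total * 1e6 for v in values]


def under_detected(bulk, pseudo, min_log2_gap=1.0, pseudocount=1.0):
    """Indices of genes whose pseudo-bulk level is below bulk by at least
    min_log2_gap on a log2 scale (threshold is an illustrative assumption)."""
    bulk_cpm, pseudo_cpm = to_cpm(bulk), to_cpm(pseudo)
    flagged = []
    for i, (b, p) in enumerate(zip(bulk_cpm, pseudo_cpm)):
        gap = math.log2(b + pseudocount) - math.log2(p + pseudocount)
        if gap >= min_log2_gap:
            flagged.append(i)
    return flagged


def longest_t_run(sequence, tail_length=350):
    """Length of the longest run of T within the last tail_length bases."""
    tail = sequence[-tail_length:].upper()
    best = current = 0
    for base in tail:
        current = current + 1 if base == "T" else 0
        best = max(best, current)
    return best


# Toy data: 3 genes x 4 cells of scRNA-seq counts, plus a paired bulk profile.
sc_counts = [
    [4, 6, 5, 5],
    [0, 0, 1, 0],
    [10, 9, 11, 10],
]
bulk_profile = [50, 40, 100]

pseudo_profile = pseudo_bulk(sc_counts)
flagged_genes = under_detected(bulk_profile, pseudo_profile)
t_run = longest_t_run("ACGTTTTTTAC")
```

In this toy example only the second gene is flagged, because its share of the pseudo-bulk profile is far smaller than its share of the bulk profile. A real analysis would repeat the comparison across all paired samples and use a dedicated motif-discovery tool on the 3′-end sequences.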

【 License 】

Unknown   
Copyright © 2023 Li, Gibson and Qiu.
