期刊论文详细信息
BMC Bioinformatics
Evaluation of methods for differential expression analysis on multi-group RNA-seq count data
Research Article
Jianqiang Sun1  Kentaro Shimizu1  Min Tang1  Koji Kadota1 
[1] Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, 113-8657, Tokyo, Japan;
关键词: Count Data;    Differential Expression;    Gene Ranking;    False Discovery Rate Threshold;    Rank Gene List;   
DOI  :  10.1186/s12859-015-0794-7
 received in 2015-03-28, accepted in 2015-10-24,  发布年份 2015
来源: Springer
PDF
【 摘 要 】

BackgroundRNA-seq is a powerful tool for measuring transcriptomes, especially for identifying differentially expressed genes or transcripts (DEGs) between sample groups. A number of methods have been developed for this task, and several evaluation studies have also been reported. However, those evaluations so far have been restricted to two-group comparisons. Accumulations of comparative studies for multi-group data are also desired.MethodsWe compare 12 pipelines available in nine R packages for detecting differential expressions (DE) from multi-group RNA-seq count data, focusing on three-group data with or without replicates. We evaluate those pipelines on the basis of both simulation data and real count data.ResultsAs a result, the pipelines in the TCC package performed comparably to or better than other pipelines under various simulation scenarios. TCC implements a multi-step normalization strategy (called DEGES) that internally uses functions provided by other representative packages (edgeR, DESeq2, and so on). We found considerably different numbers of identified DEGs (18.5 ~ 45.7 % of all genes) among the pipelines for the same real dataset but similar distributions of the classified expression patterns. We also found that DE results can roughly be estimated by the hierarchical dendrogram of sample clustering for the raw count data.ConclusionWe confirmed the DEGES-based pipelines implemented in TCC performed well in a three-group comparison as well as a two-group comparison. We recommend using the DEGES-based pipeline that internally uses edgeR (here called the EEE-E pipeline) for count data with replicates (especially for small sample sizes). For data without replicates, the DEGES-based pipeline with DESeq2 (called SSS-S) can be recommended.

【 授权许可】

CC BY   
© Tang et al. 2015

【 预 览 】
附件列表
Files Size Format View
RO202311092660190ZK.pdf 928KB PDF download
【 参考文献 】
  • [1]
  • [2]
  • [3]
  • [4]
  • [5]
  • [6]
  • [7]
  • [8]
  • [9]
  • [10]
  • [11]
  • [12]
  • [13]
  • [14]
  • [15]
  • [16]
  • [17]
  • [18]
  • [19]
  • [20]
  • [21]
  • [22]
  • [23]
  • [24]
  • [25]
  • [26]
  • [27]
  • [28]
  • [29]
  • [30]
  • [31]
  • [32]
  • [33]
  • [34]
  • [35]
  • [36]
  • [37]
  • [38]
  • [39]
  • [40]
  • [41]
  • [42]
  • [43]
  • [44]
  • [45]
  • [46]
  • [47]
  • [48]
  • [49]
  • [50]
  • [51]
  文献评价指标  
  下载次数:2次 浏览次数:0次