Journal Article Details
BMC Bioinformatics
QoRTs: a comprehensive toolset for quality control and data processing of RNA-Seq experiments
Stephen W. Hartley [1], James C. Mullikin [1]
[1] Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
Keywords: Differential splicing; Differential transcript regulation; Differential expression; Next-generation sequencing; RNA-Seq; Quality control
Others  :  1230725
DOI  :  10.1186/s12859-015-0670-5
Received 2015-05-26, accepted 2015-07-09, published 2015
【 Abstract 】

Background

High-throughput next-generation RNA sequencing has matured into a viable and powerful method for detecting variations in transcript expression and regulation. Proactive quality control is of critical importance as unanticipated biases, artifacts, or errors can potentially drive false associations and lead to flawed results.

Results

We have developed the Quality of RNA-Seq Toolset, or QoRTs, a comprehensive, multifunction toolset that assists in quality control and data processing of high-throughput RNA sequencing data.

Conclusions

QoRTs generates an unmatched variety of quality control metrics, and can provide cross-comparisons of replicates contrasted by batch, biological sample, or experimental condition, revealing any outliers and/or systematic issues that could drive false associations or otherwise compromise downstream analyses. In addition, QoRTs simultaneously replaces the functionality of numerous other data-processing tools, and can quickly and efficiently generate quality control metrics, coverage counts (for genes, exons, and known/novel splice-junctions), and browser tracks. These functions can all be carried out as part of a single unified data-processing/quality control run, greatly reducing both the complexity and the total runtime of the analysis pipeline. The software, source code, and documentation are available online at http://hartleys.github.io/QoRTs.

【 License 】

   
2015 Hartley and Mullikin.

【 Figures 】

Fig. 1.

Fig. 2.

Fig. 3.
