Journal Article Details
BMC Bioinformatics
trieFinder: an efficient program for annotating Digital Gene Expression (DGE) tags
Shawn M Burgess [3], Tyra G Wolfsberg [1], Jin Liang [2], Matthew C LaFave [3], Gabriel Renaud [1]
[1]Computational and Statistical Genomics Branch, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892-8004, USA
[2]Department of Biological Statistics and Computational Biology, Weill Institute for Cell and Molecular Biology, Ithaca, NY 14853-7202, USA
[3]Translational and Functional Genomics Branch, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892-8004, USA
Keywords: SAGE; DGE; Transcriptional profiling; RNA-Seq
Others  :  1085512
DOI  :  10.1186/1471-2105-15-329
Received 2014-08-07, accepted 2014-09-05, published 2014
【 Abstract 】

Background

Quantification of a transcriptional profile is a useful way to evaluate the activity of a cell at a given point in time. Although RNA-Seq has revolutionized transcriptional profiling, its costs are still significantly higher than those of microarrays, and the depth of data it delivers often exceeds what is needed for simple transcript quantification. Digital Gene Expression (DGE) is a cost-effective, sequence-based approach for simple transcript quantification: by sequencing one read per molecule of RNA, this technique can be used to efficiently count transcripts while obviating the need for transcript-length normalization and reducing the total number of reads necessary for accurate quantification. Here, we present trieFinder, a program specifically designed to rapidly map, parse, and annotate DGE tags of various lengths against cDNA and/or genomic sequence databases.

Results

The trieFinder algorithm maps DGE tags in a two-step process. First, it scans FASTA files of RefSeq, UniGene, and genomic DNA sequences to create a database of all tags that can be derived from a predefined restriction site. Next, it compares the experimental DGE tags to this tag database. Because the tags are stored as a prefix tree, or “trie”, an exact match can be found in time linear in the length of the tag. DGE tags that contain mismatches are resolved by recursive calls that traverse the data structure. We find that, in terms of alignment speed, the mapping functionality of trieFinder compares favorably with Bowtie.
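The two-step process described above can be illustrated with a minimal Python sketch. It is not the trieFinder implementation: the restriction site (`CATG`), the tag length, and every name used here (`TrieNode`, `extract_tags`, `insert`, `find_exact`, `find_with_mismatches`) are assumptions chosen for illustration, and FASTA parsing is omitted. The sketch extracts every tag that follows the site, stores the tags in a trie, looks up exact matches in time proportional to the tag length, and handles mismatches by recursing into alternative branches while a mismatch budget remains.

```python
from typing import Dict, List, Tuple

SITE = "CATG"  # assumption: an example restriction site, not taken from the paper
TAG_LEN = 17   # assumption: number of bases kept after the site


class TrieNode:
    """One node of the prefix tree; tag-terminating nodes carry source annotations."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.sources: List[str] = []


def extract_tags(seq: str, site: str = SITE, tag_len: int = TAG_LEN) -> List[str]:
    """Return every full-length tag that directly follows an occurrence of the site."""
    tags = []
    pos = seq.find(site)
    while pos != -1:
        start = pos + len(site)
        tag = seq[start:start + tag_len]
        if len(tag) == tag_len:  # skip tags truncated by the end of the sequence
            tags.append(tag)
        pos = seq.find(site, pos + 1)
    return tags


def insert(root: TrieNode, tag: str, source: str) -> None:
    """Add a tag to the trie and record which sequence it came from."""
    node = root
    for base in tag:
        node = node.children.setdefault(base, TrieNode())
    node.sources.append(source)


def find_exact(root: TrieNode, tag: str) -> List[str]:
    """Walk one edge per base; the cost is linear in the tag length."""
    node = root
    for base in tag:
        if base not in node.children:
            return []
        node = node.children[base]
    return node.sources


def find_with_mismatches(
    node: TrieNode, tag: str, max_mismatches: int, depth: int = 0, prefix: str = ""
) -> List[Tuple[str, List[str]]]:
    """Recursively explore branches, spending one unit of budget per mismatch."""
    if depth == len(tag):
        return [(prefix, node.sources)] if node.sources else []
    hits = []
    for base, child in node.children.items():
        cost = 0 if base == tag[depth] else 1
        if cost <= max_mismatches:
            hits.extend(
                find_with_mismatches(
                    child, tag, max_mismatches - cost, depth + 1, prefix + base
                )
            )
    return hits


if __name__ == "__main__":
    # Toy reference sequence and a short tag length so the example stays readable.
    reference = {"seqA": "GGCATGACGTAACCATGTTTTGG"}
    root = TrieNode()
    for name, seq in reference.items():
        for tag in extract_tags(seq, tag_len=4):
            insert(root, tag, name)
    print(find_exact(root, "ACGT"))
    print(find_with_mismatches(root, "ACGA", max_mismatches=1))
```

Storing the source annotation at the node where a tag ends means that a lookup returns the annotation directly, with no separate post-processing step. The recursive search visits only branches that stay within the mismatch budget, so most of the trie is pruned early.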

Conclusions

trieFinder quickly annotates DGE tags against three sources simultaneously, which simplifies transcript quantification and novel transcript detection. It delivers the results in a simple parsed format, obviating the need to post-process the alignment output. trieFinder is available at http://research.nhgri.nih.gov/software/trieFinder/.

【 License 】

   
2014 Renaud et al.; licensee BioMed Central Ltd.

【 Figures 】

Figure 1.

Figure 2.

【 References 】
  • [1]Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, Moreno RF, Kerlavage AR, McCombie WR, Venter JC: Complementary DNA sequencing: expressed sequence tags and human genome project. Science 1991, 252(5013):1651-1656.
  • [2]Lockhart DJ, Dong H, Byrne MC, Follettie MT, Gallo MV, Chee MS, Mittmann M, Wang C, Kobayashi M, Horton H, Brown EL: Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol 1996, 14(13):1675-1680.
  • [3]Lashkari DA, DeRisi JL, McCusker JH, Namath AF, Gentile C, Hwang SY, Brown PO, Davis RW: Yeast microarrays for genome wide parallel genetic and gene expression analysis. Proc Natl Acad Sci U S A 1997, 94(24):13057-13062.
  • [4]Velculescu VE, Zhang L, Vogelstein B, Kinzler KW: Serial analysis of gene expression. Science 1995, 270(5235):484-487.
  • [5]Audic S, Claverie JM: The significance of digital gene expression profiles. Genome Res 1997, 7(10):986-995.
  • [6]Brenner S, Johnson M, Bridgham J, Golda G, Lloyd DH, Johnson D, Luo S, McCurdy S, Foy M, Ewan M, Roth R, George D, Eletr S, Albrecht G, Vermaas E, Williams SR, Moon K, Burcham T, Pallas M, DuBridge RB, Kirchner J, Fearon K, Mao J, Corcoran K: Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat Biotechnol 2000, 18(6):630-634.
  • [7]Morrissy AS, Morin RD, Delaney A, Zeng T, McDonald H, Jones S, Zhao Y, Hirst M, Marra MA: Next-generation tag sequencing for cancer gene expression profiling. Genome Res 2009, 19(10):1825-1835.
  • [8]Liang J, Wang D, Renaud G, Wolfsberg TG, Wilson AF, Burgess SM: The stat3/socs3a pathway is a key regulator of hair cell regeneration in zebrafish. [corrected]. J Neurosci 2012, 32(31):10662-10673.
  • [9]Schena M, Shalon D, Davis RW, Brown PO: Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 1995, 270(5235):467-470.
  • [10]Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M, Snyder M: The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 2008, 320(5881):1344-1349.
  • [11]Wilhelm BT, Marguerat S, Watt S, Schubert F, Wood V, Goodhead I, Penkett CJ, Rogers J, Bahler J: Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature 2008, 453(7199):1239-1243.
  • [12]Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B: Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 2008, 5(7):621-628.
  • [13]Fredkin E: Trie Memory. Commun ACM 1960, 3(9):490-499.
  • [14]Bradley KM, Elmore JB, Breyer JP, Yaspan BL, Jessen JR, Knapik EW, Smith JR: A major zebrafish polymorphism resource for genetic mapping. Genome Biol 2007, 8(4):R55.
  • [15]Martin M: Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 2011, 17(1):10-12.
  • [16]Langmead B, Trapnell C, Pop M, Salzberg SL: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 2009, 10(3):R25.