BMC Bioinformatics
afterParty: turning raw transcriptomes into permanent resources
Martin Jones [1], Mark Blaxter [1]
[1] Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK
Keywords: Annotation; Assembly; Transcriptome
DOI: 10.1186/1471-2105-14-301
Received 2012-12-07, accepted 2013-10-03, published 2013
【 Abstract 】
Background
Next-generation DNA sequencing technologies have made it possible to generate transcriptome data for novel organisms quickly and cheaply, to the extent that annotating and publishing a new transcriptome now takes more effort than sequencing it. Often, following publication, details of the annotation effort are available only in summary form, hindering subsequent exploitation of the data. To promote best practice in annotation and to ensure that data remain accessible, we have written afterParty, a web application that allows users to assemble, annotate and publish novel transcriptomes using only a web browser.
Results
afterParty is a robust web application that implements best-practice transcriptome assembly, annotation, browsing, searching, and visualization. Users can turn a collection of reads (from Roche 454 chemistry) or assembled contigs (from any sequencing chemistry, including Illumina Solexa RNA-Seq) into a searchable, browsable transcriptome resource and quickly make it publicly available. Contigs are functionally annotated based on similarity to known sequences and protein domains. Once assembled and annotated, transcriptomes derived from multiple species or libraries can be compared and searched. afterParty datasets can be created either on the existing afterParty server or on local instances that can be built easily using a virtual machine. afterParty includes powerful visualization tools for transcriptome dataset exploration and uses a flexible annotation architecture that will allow additional types of annotation to be added in the future.
Conclusions
afterParty's main use case is one in which a working biologist has generated a large volume of transcribed sequence data and wishes to turn it into a useful, durable resource. By reducing the effort, bioinformatics skills, and computational resources needed to annotate and publish a transcriptome, afterParty will facilitate the annotation and sharing of sequence data that would otherwise remain unavailable. A typical metazoan transcriptome containing several tens of thousands of contigs can be annotated in a few minutes of interactive time and a few days of computational time.
【 License 】
2013 Jones and Blaxter; licensee BioMed Central Ltd.