GigaScience
VirAmp: a Galaxy-based viral genome assembly pipeline
Moriah L Szpara [1], Istvan Albert [1], Daniel W Renner [1], Yinan Wan [2]
[1] Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park 16802, PA, USA; The Huck Institutes of the Life Sciences, University Park 16802, PA, USA
Keywords: variation analysis; assembly pipeline; viral genome; herpes simplex virus; next generation sequencing
DOI: 10.1186/s13742-015-0060-y
Received 2014-07-14, accepted 2015-04-09, published 2015
【 Abstract 】
Background
Advances in next generation sequencing make it possible to obtain high-coverage sequence data for large numbers of viral strains in a short time. However, because most bioinformatics tools are developed for command-line use, the selection and accessibility of computational tools for genome assembly and variation analysis limit the ability of individual labs to perform further bioinformatics analysis.
Findings
We have developed a multi-step viral genome assembly pipeline named VirAmp, which combines existing tools and techniques and presents them to end users via a web-enabled Galaxy interface. Our pipeline allows users to assemble, analyze, and interpret high-coverage viral sequencing data with an ease and efficiency that was not previously possible. Our software makes a large number of genome assembly and related tools available to life scientists and automates the currently recommended best practices in a single, easy-to-use interface. We tested our pipeline with three different datasets from human herpes simplex virus (HSV).
Conclusions
VirAmp provides a user-friendly interface and a complete pipeline for viral genome analysis. We make our software available via an Amazon Elastic Cloud disk image that can be easily launched by anyone with an Amazon Web Services account. A fully functional demonstration instance of our system can be found at http://viramp.com/. We also maintain detailed documentation on each tool and methodology at http://docs.viramp.com.
【 License 】
2015 Wan et al.; licensee BioMed Central.