Journal Article Details
Genome Biology
Kraken: ultrafast metagenomic sequence classification using exact alignments
Steven L Salzberg [1], Derrick E Wood [2]
[1] Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA; Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
Keywords: microbiome; next-generation sequencing; sequence alignment; sequence classification; metagenomics
Others  :  863283
DOI  :  10.1186/gb-2014-15-3-r46
Received 2013-11-17; accepted 2014-03-03; published 2014
【 Abstract 】

Kraken is an ultrafast and highly accurate program for assigning taxonomic labels to metagenomic DNA sequences. Previous programs designed for this task have been relatively slow and computationally expensive, forcing researchers to use faster abundance estimation programs, which only classify small subsets of metagenomic data. Using exact alignment of k-mers, Kraken achieves classification accuracy comparable to the fastest BLAST program. In its fastest mode, Kraken classifies 100 base pair reads at a rate of over 4.1 million reads per minute, 909 times faster than Megablast and 11 times faster than the abundance estimation program MetaPhlAn. Kraken is available at http://ccb.jhu.edu/software/kraken/.
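
As an illustration of what "exact alignment of k-mers" can mean in practice, the following is a minimal Python sketch, not Kraken's actual implementation. It assumes a toy in-memory index and a simple majority vote over k-mers that occur in exactly one reference. The function names, the taxon labels, the reference strings, and the choice k = 5 are all hypothetical and chosen only to keep the example small.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every length-k substring of seq (nothing if seq is shorter than k)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer to the set of labels whose reference sequence contains it."""
    index = {}
    for label, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(label)
    return index


def classify(read, index, k):
    """Return the label with the most exact k-mer hits, or None if nothing matches.

    Assumption of this sketch: only k-mers found in exactly one reference vote.
    """
    votes = Counter()
    for kmer in kmers(read, k):
        labels = index.get(kmer)
        if labels and len(labels) == 1:
            votes[next(iter(labels))] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Toy references: hypothetical sequences, not real genomes.
REFERENCES = {
    "taxon_A": "ACGTACGTTAGCCGATAGGCTA",
    "taxon_B": "TTGACCGGTATCGGATCCAAGT",
}
K = 5
INDEX = build_index(REFERENCES, K)

if __name__ == "__main__":
    for read in ("CGTTAGCCGA", "GGTATCGGAT", "GGGGGGGG"):
        print(read, classify(read, INDEX, K))
```

Because every lookup is an exact dictionary hit rather than an alignment with mismatches and gaps, the per-read cost in this sketch grows only with the number of k-mers in the read, which is the general intuition behind the speed claim in the abstract.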

【 License 】

   
2014 Wood and Salzberg; licensee BioMed Central Ltd.
