Genome Biology
A systematic comparison of chloroplast genome assembly tools
Jan A. Freudenthal, Niklas Terhoeven, Markus J. Ankenbrand, Simon Pfaff, Arthur Korte, Frank Förster
Affiliations: Center for Computational and Theoretical Biology, University of Würzburg, Campus Hubland Nord, 97074, Würzburg, Germany; AnaLife Data Science, Wiesengrund 16, 97295 Waldbrunn, Würzburg, Germany; Chair of Cellular and Molecular Imaging, Comprehensive Heart Failure Center, University Hospital Würzburg, Josef-Schneider-Str. 2, 97080, Würzburg, Germany; Department of Bioinformatics, University of Würzburg, Biozentrum, Am Hubland, 97074, Würzburg, Germany; Fraunhofer IME-BR, Ohlebergsweg 12, 35392, Gießen, Germany; Bioinformatics Core Facility of the University of Gießen, Heinrich-Buff-Ring 58, 35392, Gießen, Germany
Keywords: Chloroplast; Genome; Assembly; Software; Benchmark
DOI: 10.1186/s13059-020-02153-6
Source: Springer
【 Abstract 】
Background: Chloroplasts are intracellular organelles that enable plants to conduct photosynthesis. They arose through the symbiotic integration of a prokaryotic cell into a eukaryotic host cell and still contain their own genomes with distinct genomic information. Plastid genomes accommodate essential genes and are regularly utilized in biotechnology and phylogenetics. Different assemblers that are able to assess the plastid genome have been developed. These assemblers often use data from whole genome sequencing experiments, which usually contain reads from the complete chloroplast genome.

Results: The performance of different assembly tools has never been systematically compared. Here, we present a benchmark of seven chloroplast assembly tools, capable of succeeding in more than 60% of known real data sets. Our results show significant differences between the tested assemblers in terms of generating whole chloroplast genome sequences and computational requirements. The examination of 105 data sets from species with unknown plastid genomes leads to the assembly of 20 novel chloroplast genomes.

Conclusions: We create Docker images for each tested tool that are freely available to the scientific community and ensure reproducibility of the analyses. These containers allow the analysis and screening of data sets for chloroplast genomes using standard computational infrastructure. Thus, large-scale screening for chloroplasts within genomic sequencing data is feasible.
【 License 】
CC BY