Genome Biology
Complete vertebrate mitogenomes reveal widespread repeats and gene duplications
The Vertebrate Genomes Project Consortium [1], Arkarachai Fungtammasan [2], Marco Rosario Capodiferro [3], Alessandro Achilli [3], Peter Houde [4], Edward L. Braun [5], David S. Horner [6], Matteo Chiara [6], Roberto Ambrosini [7], Sergey Koren [8], Arang Rhie [8], Adam M. Phillippy [8], Woori Kwak [9], Samara Brown [1,10], Eugene Myers [1,11], Sylke Winkler [1,11], Farooq O. Al-Ajli [1,12], Vania Costa [1,13], Daniel Fordham [1,13], Simon Mayes [1,13], Jonas Korlach [1,14], Bettina Haase [1,15], Giulio Formenti [1,15], Jacquelyn Mountcastle [1,15], Jennifer Balacco [1,15], Erich D. Jarvis [1,15], Olivier Fedrigo [1,15], Shane McCarthy [1,16], Richard Durbin [1,16], James Torrance [1,16], Craig Corton [1,16], Jason Skelton [1,16], Jonathan Wood [1,16], Emma Betteridge [1,16], Karen Oliver [1,16], Alan Tracey [1,16], Jale Dolucan [1,16], Michelle Smith [1,16], Iliana Bista [1,16], Marcela Uliano-Silva [1,16], Kerstin Howe [1,16]
[1]; [2] DNAnexus Inc.; [3] Department of Biology and Biotechnology “L. Spallanzani”, University of Pavia; [4] Department of Biology, New Mexico State University; [5] Department of Biology, University of Florida; [6] Department of Biosciences, University of Milan; [7] Department of Environmental Science and Policy, University of Milan; [8] Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; [9] Hoonygen; [10] Laboratory of Neurogenetics of Language, Rockefeller University; [11] Max Planck Institute of Molecular Cell Biology & Genetics; [12] Monash University Malaysia Genomics Facility, School of Science; [13] Oxford Nanopore Technologies Ltd, Oxford Science Park; [14] Pacific Biosciences; [15] The Vertebrate Genome Lab, Rockefeller University; [16] Wellcome Sanger Institute
Keywords: Mitochondrial DNA; Vertebrate; Assembly; Long reads; Sequencing; Duplications
DOI: 10.1186/s13059-021-02336-9
Source: DOAJ
【Abstract】
Background: Modern sequencing technologies should make the assembly of the relatively small mitochondrial genomes an easy undertaking. However, few tools exist that address mitochondrial assembly directly.

Results: As part of the Vertebrate Genomes Project (VGP), we develop mitoVGP, a fully automated pipeline for similarity-based identification of mitochondrial reads and de novo assembly of mitochondrial genomes that incorporates both long (> 10 kbp, PacBio or Nanopore) and short (100–300 bp, Illumina) reads. Our pipeline leads to successful complete mitogenome assemblies of 100 vertebrate species of the VGP. We observe that tissue type and library size selection have considerable impact on mitogenome sequencing and assembly. Comparing our assemblies to purportedly complete reference mitogenomes based on short-read sequencing, we identify errors, missing sequences, and incomplete genes in those references, particularly in repetitive regions. Our assemblies also identify novel gene region duplications. The presence of repeats and duplications in over half of the species herein assembled indicates that their occurrence is a principle of mitochondrial structure rather than an exception, shedding new light on mitochondrial genome evolution and organization.

Conclusions: Our results indicate that even in the “simple” case of vertebrate mitogenomes, the completeness of many currently available reference sequences can be further improved, and caution should be exercised before claiming the complete assembly of a mitogenome, particularly from short reads alone.
【License】
Unknown