Biology Direct
Highly divergent ancient gene families in metagenomic samples are compatible with additional divisions of life
Philippe Lopez [1], Sébastien Halary [2], Eric Bapteste [1]
[1] Team ‘Adaptation, Integration, Reticulation, Evolution’ – UMR CNRS 7138 Evolution Paris Seine – Institut de Biologie Paris Seine – Université Pierre et Marie Curie, 7 quai St Bernard, Paris, 75005, France
[2] Département de Sciences Biologiques, Institut de recherche en biologie végétale, Université de Montréal, Montréal H1X 2B2, QC, Canada
Keywords: Environmental diversity; Networks; Comparative analysis; Metagenomics; Microbiology
DOI: 10.1186/s13062-015-0092-3
Received 2015-07-30, accepted 2015-10-13, published 2015
Abstract
Background
Microbial genetic diversity is often investigated via the comparison of relatively similar 16S molecules through multiple alignments between reference sequences and novel environmental samples, using phylogenetic trees, direct BLAST matches, or phylotype counts. However, are we missing novel lineages in the microbial dark universe by relying on standard phylogenetic and BLAST methods? If so, how can we probe that universe using alternative approaches? We performed a novel type of multi-marker analysis of genetic diversity exploiting the topology of inclusive sequence similarity networks.
Results
Our protocol identified 86 ancient gene families, well distributed and rarely transferred across the three domains of life, and retrieved their environmental homologs among 10 million predicted ORFs from human gut samples and other metagenomic projects. Numerous highly divergent environmental homologs were observed in gut samples, although the most divergent genes were over-represented in non-gut environments. In our networks, most divergent environmental genes grouped exclusively with uncultured relatives, in maximal cliques. Sequences within these groups were under strong purifying selection and presented a range of genetic variation comparable to that of a prokaryotic domain.
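As an illustration of the clique-based grouping mentioned above, the following minimal Python sketch enumerates maximal cliques in a toy sequence similarity network and keeps those made up exclusively of environmental sequences. It is not the authors' pipeline: the sequence names, the `EDGES` list, the `ORIGIN` labels, the `min_size` cutoff, and the helper functions are all hypothetical, and the edges are assumed to come from some upstream similarity search with a threshold already applied. The enumeration uses the standard Bron-Kerbosch algorithm with pivoting.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Turn an undirected edge list into an adjacency map."""
    adj: Dict[str, Set[str]] = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj: Dict[str, Set[str]]) -> List[FrozenSet[str]]:
    """Enumerate all maximal cliques (Bron-Kerbosch with pivoting)."""
    cliques: List[FrozenSet[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        if not p and not x:
            cliques.append(frozenset(r))  # r cannot be extended: it is maximal
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def environmental_only_cliques(
    edges: Iterable[Tuple[str, str]],
    origin: Dict[str, str],
    min_size: int = 3,  # hypothetical cutoff, not taken from the article
) -> List[List[str]]:
    """Return maximal cliques whose members are all labelled 'environmental'."""
    adj = build_adjacency(edges)
    return sorted(
        sorted(clique)
        for clique in maximal_cliques(adj)
        if len(clique) >= min_size
        and all(origin[s] == "environmental" for s in clique)
    )


# Toy network (hypothetical): an edge means two sequences are similar enough.
EDGES = [
    ("env1", "env2"), ("env1", "env3"), ("env2", "env3"),
    ("env3", "cult1"),
    ("cult1", "cult2"), ("cult1", "cult3"), ("cult2", "cult3"),
]
ORIGIN = {
    "env1": "environmental", "env2": "environmental", "env3": "environmental",
    "cult1": "cultured", "cult2": "cultured", "cult3": "cultured",
}

print(environmental_only_cliques(EDGES, ORIGIN))  # [['env1', 'env2', 'env3']]
```

In this toy case `env3` is also linked to a cultured sequence, but the clique it forms with `env1` and `env2` contains no cultured member, so that group is reported. Whether such a group should count as "exclusively environmental" when one member has a cultured neighbour outside the clique is a design choice the abstract does not settle.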
Conclusions
Many gene families included environmental homologs that were highly divergent from cultured homologs: in 79 gene families (including 18 ribosomal proteins), Bacteria and Archaea were less divergent from each other than some groups of environmental sequences were from any cultured or viral homologs. Moreover, some groups of environmental homologs branched very deeply in phylogenetic trees of life, when they were not too divergent to be aligned. These results underline how limited our understanding of the most diverse elements of the microbial world remains, and encourage a deeper exploration of natural communities and their genetic resources, hinting that major divisions of life remain to be discovered.
Reviewers
This article was reviewed by Eugene Koonin, William Martin and James McInerney.
License
2015 Lopez et al.