Journal article details
Biotechnology for Biofuels
Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders
Sebastian GA Konietzny2  Phillip B Pope1  Aaron Weimann2  Alice C McHardy2 
[1] Department of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, 1432, Norway
[2] Department of Algorithmic Bioinformatics, Heinrich Heine University Düsseldorf, Düsseldorf 40225, Germany
Keywords: Gene clusters; Polysaccharide utilization loci; Feature ranking; Pectin degradation; Phenotype-based identification of functional modules; Plant biomass degradation; (Ligno)cellulose degradation; Probabilistic topic models; LDA; Latent Dirichlet allocation
DOI  :  10.1186/s13068-014-0124-8
Received 2014-04-19, accepted 2014-08-05, published 2014
【 Abstract 】

Background

Efficient industrial processes for converting plant lignocellulosic materials into biofuels are key to global efforts to develop alternatives to fossil fuels. Novel cellulolytic enzymes have been discovered in microbial genomes and in metagenomes of microbial communities. However, identifying relevant genes without known homologs, and elucidating the lignocellulolytic pathways and protein complexes of different microorganisms, remain challenging.

Results

We describe a new computational method for the targeted discovery of functional modules of plant biomass-degrading protein families, based on their co-occurrence patterns across genomes and metagenome datasets, and the strength of association of these modules with the genomes of known degraders. From approximately 6.4 million family annotations for 2,884 microbial genomes, and 332 taxonomic bins from 18 metagenomes, we identified 5 functional modules that are distinctive for plant biomass degraders, which we term “plant biomass degradation modules” (PDMs). These modules incorporate protein families involved in the degradation of cellulose, hemicelluloses, and pectins, structural components of the cellulosome, and additional families with potential functions in plant biomass degradation. The PDMs were linked to 81 gene clusters in genomes of known lignocellulose degraders, including previously described clusters of lignocellulolytic genes. On average, 70% of the families of each PDM were found to map to gene clusters in known degraders, which served as an additional confirmation of their functional relationships. The presence of a PDM in a genome or taxonomic metagenome bin furthermore allowed us to accurately predict the ability of any particular organism to degrade plant biomass. For 15 draft genomes of a cow rumen metagenome, we used cross-referencing to confirmed cellulolytic enzymes to validate that the PDMs identified plant biomass degraders within a complex microbial community.
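The idea of predicting degradation ability from the presence of a module can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the two toy modules, the scoring rule (fraction of a module's families found in a genome), the 0.5 threshold, and the function names `module_completeness` and `predict_degrader`. The abstract does not state how presence of a PDM was actually scored.

```python
def module_completeness(module, genome_families):
    """Fraction of a module's protein families that occur in a genome."""
    module = set(module)
    if not module:
        return 0.0
    return len(module & set(genome_families)) / len(module)


def predict_degrader(modules, genome_families, threshold=0.5):
    """Flag a genome as a putative degrader if any module is complete enough.

    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    return any(
        module_completeness(families, genome_families) >= threshold
        for families in modules.values()
    )


# Toy modules built from CAZy family names; not the PDMs from the paper.
toy_modules = {
    "cellulose": ["GH5", "GH9", "CBM3", "GH48"],
    "pectin": ["PL1", "PL9", "CE8", "GH28"],
}

# Hypothetical family annotations for two genomes.
genome_a = ["GH5", "GH9", "CBM3", "PF00005"]
genome_b = ["PF00005", "PF00072"]

print(predict_degrader(toy_modules, genome_a))
print(predict_degrader(toy_modules, genome_b))
```

A set-overlap score like this ignores how strongly each family is weighted within a module; a weighted variant would be a natural refinement if such weights are available.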

Conclusions

Functional modules of protein families that are involved in different aspects of plant cell wall degradation can be inferred from co-occurrence patterns across (meta-)genomes with a probabilistic topic model. PDMs represent a new resource of protein families and candidate genes implicated in microbial plant biomass degradation. They can also be used to predict the plant biomass degradation ability for a genome or taxonomic bin. The method is also suitable for characterizing other microbial phenotypes.
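To make the topic-model idea concrete, the sketch below treats each genome as a "document" and each protein family annotation as a "word", then fits latent Dirichlet allocation with a plain collapsed Gibbs sampler. It is an illustration under stated assumptions, not the authors' pipeline: the toy genomes, the hyperparameters `alpha` and `beta`, the number of topics, the iteration count, and the function name `lda_gibbs` are all hypothetical.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs: list of token lists (here: protein family IDs per genome).
    Returns (vocab, phi, theta), where phi[k][w] estimates P(family w | topic k)
    and theta[d][k] estimates P(topic k | genome d).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    n_dk = [[0] * n_topics for _ in docs]      # topic counts per genome
    n_kw = [[0] * V for _ in range(n_topics)]  # family counts per topic
    n_k = [0] * n_topics                       # total tokens per topic
    z = []                                     # topic assignment per token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][index[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi, k = index[w], z[d][i]
                # Take the token out of the counts before resampling it.
                n_dk[d][k] -= 1
                n_kw[k][wi] -= 1
                n_k[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][wi] + beta) / (n_k[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][wi] += 1
                n_k[k] += 1

    phi = [[(n_kw[k][w] + beta) / (n_k[k] + V * beta) for w in range(V)]
           for k in range(n_topics)]
    theta = [[(n_dk[d][k] + alpha) / (len(doc) + n_topics * alpha)
              for k in range(n_topics)]
             for d, doc in enumerate(docs)]
    return vocab, phi, theta


# Toy annotations (hypothetical): three genomes rich in cellulose-related
# CAZy families and three rich in pectin-related ones.
genomes = [
    ["GH5", "GH9", "CBM3", "GH48"],
    ["GH5", "GH9", "GH48", "CBM3", "GH5"],
    ["GH9", "CBM3", "GH48"],
    ["PL1", "PL9", "CE8", "GH28"],
    ["PL1", "CE8", "GH28", "PL9", "PL1"],
    ["PL9", "CE8", "GH28"],
]

vocab, phi, theta = lda_gibbs(genomes, n_topics=2)
for k, row in enumerate(phi):
    top = sorted(zip(row, vocab), reverse=True)[:4]
    print(f"topic {k}:", [family for _, family in top])
```

Each topic's highest-probability families can then be read as a candidate functional module. Which topics are phenotype-defining would still have to be decided separately, for example by how strongly they are associated with genomes of known degraders.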

【 License 】

   
2014 Konietzny et al.; licensee BioMed Central Ltd.
