BMC Bioinformatics
Structure- and context-based analysis of the GxGYxYP family reveals a new putative class of Glycoside Hydrolase
Daniel J Rigden [3], Ruth Y Eberhardt [4], Harry J Gilbert [5], Qingping Xu [1], Yuanyuan Chang [2], Adam Godzik [6]
[1] Joint Center for Structural Genomics, Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park CA 94025, USA
[2] Joint Center for Structural Genomics, Program on Bioinformatics and Systems Biology, Sanford-Burnham Medical Research Institute, La Jolla CA 92037, USA
[3] Institute of Integrative Biology, University of Liverpool, Liverpool, UK
[4] European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridgeshire CB10 1SD, UK
[5] Institute for Cell and Molecular Biosciences, The Medical School, Newcastle University, Framlington Place, Newcastle Upon Tyne NE2 4HH, UK
[6] Joint Center for Structural Genomics, Center for Research in Biological Systems, University of California, San Diego, La Jolla CA 92093, USA
Keywords: Gut microbiota; Protein family; 3D structure; JCSG; Protein function prediction; PUL; Polysaccharide Utilization Locus; Glycoside hydrolase; Carbohydrate metabolism
DOI: 10.1186/1471-2105-15-196
Received 2014-03-24; accepted 2014-06-10; published 2014
【 Abstract 】
Background
Gut microbiome metagenomics has revealed many protein families and domains found largely or exclusively in that environment. Proteins containing the GxGYxYP domain are over-represented in the gut microbiota, and are found in Polysaccharide Utilization Loci in the gut symbiont Bacteroides thetaiotaomicron, suggesting their involvement in polysaccharide metabolism, but little else is known of the function of this domain.
Results
Genomic context and domain architecture analyses support a role for the GxGYxYP domain in carbohydrate metabolism. Sparse occurrences in eukaryotes are the result of lateral gene transfer. The structure of the GxGYxYP domain-containing protein encoded by the BT2193 locus reveals two structural domains, the first composed of three divergent repeats with no recognisable homology to previously solved structures, the second a more familiar seven-stranded β/α barrel. Structure-based analyses including conservation mapping localise a presumed functional site to a cleft between the two domains of BT2193. Matching to a catalytic site template from a GH9 cellulase and other analyses point to a putative catalytic triad composed of Glu272, Asp331 and Asp333.
Conclusions
We suggest that GxGYxYP-containing proteins constitute a novel glycoside hydrolase family of as yet unknown specificity.
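The family name can be read as a short sequence pattern, which the following minimal sketch uses for illustration. It assumes that each `x` in GxGYxYP stands for any single residue; the function name `find_motif` and the toy sequence are hypothetical and not taken from the paper. A plain regular expression is swapped in here for simplicity: it is not how the domain is defined or detected in practice, since profile-based methods are the usual way to assign family membership.

```python
import re
from typing import List, Tuple

# Assumption: "GxGYxYP" is read as a motif in which x matches any residue.
MOTIF = re.compile(r"G.GY.YP")


def find_motif(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched text) for non-overlapping hits."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence, not derived from BT2193 or any real protein.
toy_sequence = "MKTLAGSGYDYPLLNQ"
print(find_motif(toy_sequence))
```

A hit from such a scan only flags a candidate region; it says nothing about fold or catalytic residues, which the abstract infers from the crystal structure and conservation mapping.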
【 License 】
2014 Rigden et al.; licensee BioMed Central Ltd.