Journal article details
BMC Bioinformatics
Structural genomics analysis of uncharacterized protein families overrepresented in human gut bacteria identifies a novel glycoside hydrolase
Adam Godzik [3], Herbert L Axelrod [5], Christian C Zmasek [2], Zhanwen Li [2], Yuanyuan Chang [2], Daniel J Rigden [4], Ruth Y Eberhardt [1], Anna Sheydina [2]
[1]European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
[2]Bioinformatics and Systems Biology Program, Sanford-Burnham Medical Research Institute, La Jolla, CA 92037, USA
[3]Center for Research in Biological Systems, University of California, 9500 Gilman Dr., La Jolla, CA 92093-0446, USA
[4]Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK
[5]Stanford Synchrotron Radiation Lightsource, Menlo Park, CA 94025, USA
Keywords: DUF; Domain of unknown function; Protein function prediction; Protein family; 3D structure; Carbohydrate metabolism; Glycoside hydrolase
DOI: 10.1186/1471-2105-15-112
Received 2013-09-09, accepted 2014-03-31, published 2014
【 Abstract 】

Background

Bacteroides spp. form a significant part of our gut microbiome and are well known for optimized metabolism of diverse polysaccharides. Initial analysis of the archetypal Bacteroides thetaiotaomicron genome identified 172 glycosyl hydrolases and a large number of uncharacterized proteins associated with polysaccharide metabolism.

Results

BT_1012 from Bacteroides thetaiotaomicron VPI-5482 is a protein of unknown function and a member of a large protein family consisting entirely of uncharacterized proteins. Initial sequence analysis predicted that this protein has two domains, one at the N-terminus and one at the C-terminus. A PSI-BLAST search found over 150 full-length homologs and over 90 half-size homologs consisting only of the N-terminal domain. The experimentally determined three-dimensional structure of the BT_1012 protein confirms its two-domain architecture, and structural analysis of both domains suggests their specific functions. The N-terminal domain is a putative catalytic domain with significant similarity to known glycoside hydrolases. The C-terminal domain has a beta-sandwich fold of the kind found in the C-terminal domains of other glycosyl hydrolases, where such domains are typically involved in substrate binding. We describe the structure of the BT_1012 protein and discuss its sequence-structure relationships and their possible functional implications.

Conclusions

Structural and sequence analyses of the BT_1012 protein identify it as a glycosyl hydrolase, expanding an already impressive catalog of enzymes involved in polysaccharide metabolism in Bacteroides spp. On this basis, we have renamed the Pfam families representing the two domains found in the BT_1012 protein, PF13204 and PF12904, as putative glycoside hydrolase and glycoside hydrolase-associated C-terminal domain, respectively.

【 License 】

2014 Sheydina et al.; licensee BioMed Central Ltd.
