Journal Article Details
BMC Structural Biology
A PDB-wide, evolution-based assessment of protein–protein interfaces
Guido Capitani [3], Spencer Bliven [2], Nikhil Biyani [3], Jose M Duarte [1], Kumaran Baskaran [3]
[1] Institute of Molecular Biology and Biophysics, ETH Zürich, Zürich 8093, Switzerland
[2] National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda 20894, MD, USA
[3] Laboratory of Biomolecular Research, Paul Scherrer Institute, OFLC/110, Villigen PSI 5232, Switzerland
Keywords: PDB; PISA; EPPIC; Crystal contacts; Biological interfaces; Protein–protein interfaces
DOI: 10.1186/s12900-014-0022-0
Received 2014-07-31, accepted 2014-09-25, published 2014
【 Abstract 】

Background

Thanks to the growth in sequence and structure databases, more than 50 million sequences are now available in UniProt and 100,000 structures in the PDB. Rich information about protein–protein interfaces can be obtained by a comprehensive study of protein contacts in the PDB, their sequence conservation and geometric features.

Results

An automated computational pipeline was developed to run our Evolutionary Protein–Protein Interface Classifier (EPPIC) software on the entire PDB and store the results in a relational database, currently containing > 800,000 interfaces. This allows the analysis of interface data on a PDB-wide scale. Two large benchmark datasets of biological interfaces and crystal contacts, each containing about 3000 entries, were automatically generated based on criteria thought to be strong indicators of interface type. The BioMany set of biological interfaces includes NMR dimers solved as crystal structures and interfaces that are preserved across diverse crystal forms, as catalogued by the Protein Common Interface Database (ProtCID) from Xu and Dunbrack. The second dataset, XtalMany, is derived from interfaces that would lead to infinite assemblies and are therefore crystal contacts. BioMany and XtalMany were used to benchmark the EPPIC approach. The performance of EPPIC was also compared to classifications from the Protein Interfaces, Surfaces, and Assemblies (PISA) program on a PDB-wide scale, finding that the two approaches give the same call in about 88% of PDB interfaces. By comparing our safest predictions to the PDB author annotations, we provide a lower-bound estimate of the error rate of biological unit annotations in the PDB. Additionally, we developed a PyMOL plugin for direct download and easy visualization of EPPIC interfaces for any PDB entry. Both the datasets and the PyMOL plugin are available at http://www.eppic-web.org/ewui/#downloads.
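
To make the PDB-wide comparison concrete, the following is a minimal sketch of how per-interface calls from two methods could be stored in a relational table and reduced to an agreement rate. It is not the EPPIC pipeline or its database schema: the table name `interface_call`, its columns, the `bio`/`xtal` labels, and the four toy rows are all assumptions made up for illustration.

```python
import sqlite3


def agreement_rate(call_pairs):
    """Return the fraction of interfaces on which two methods give the same call."""
    call_pairs = list(call_pairs)
    if not call_pairs:
        raise ValueError("no interfaces to compare")
    same = sum(1 for first, second in call_pairs if first == second)
    return same / len(call_pairs)


def build_toy_db():
    """Create an in-memory table of per-interface calls (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE interface_call ("
        " entry_id TEXT, interface_id INTEGER,"
        " eppic_call TEXT, pisa_call TEXT,"
        " PRIMARY KEY (entry_id, interface_id))"
    )
    # Toy rows only: placeholder entry identifiers, not real PDB results.
    toy_rows = [
        ("toy1", 1, "bio", "bio"),
        ("toy1", 2, "xtal", "xtal"),
        ("toy2", 1, "bio", "xtal"),
        ("toy2", 2, "xtal", "xtal"),
    ]
    conn.executemany("INSERT INTO interface_call VALUES (?, ?, ?, ?)", toy_rows)
    return conn


conn = build_toy_db()
calls = conn.execute(
    "SELECT eppic_call, pisa_call FROM interface_call"
).fetchall()
rate = agreement_rate(calls)
print(f"agreement: {rate:.2f}")  # 3 of the 4 toy rows agree
```

Keying the table on entry and interface number keeps one row per interface, so the same query scales from a single entry to every interface in the database.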

Conclusions

Our computational pipeline allows us to analyze protein–protein contacts and their sequence conservation across the entire PDB. Two new benchmark datasets are provided, which are over an order of magnitude larger than existing manually curated ones. These tools enable the comprehensive study of several aspects of protein–protein contacts in the PDB and represent a basis for future, even larger scale studies of protein–protein interactions.

【 License 】

   
2014 Baskaran et al.; licensee BioMed Central Ltd.

【 References 】
  • [1] Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The protein data bank. Nucleic Acids Res 2000, 28:235-242. [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=102472&tool=pmcentrez&rendertype=abstract]
  • [2] Schärer MA, Grütter MG, Capitani G: CRK: An evolutionary approach for distinguishing biologically relevant interfaces from crystal contacts. Proteins: Struct Funct Bioinformatics 2010. [http://doi.wiley.com/10.1002/prot.22787]
  • [3] Duarte JM, Srebniak A, Capitani G: Protein interface classification by evolutionary analysis. BMC Bioinformatics 2012, 13:334. [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3556496&tool=pmcentrez&rendertype=abstract]
  • [4] Duarte JM, Biyani N, Baskaran K, Capitani G: An analysis of oligomerization interfaces in transmembrane proteins. BMC Struct Biol 2013, 13:21. [http://www.ncbi.nlm.nih.gov/pubmed/24134166]
  • [5] Ivan G, Szabadka Z, Grolmusz V: A hybrid clustering of protein binding sites. FEBS J 2010, 277(6):1494-1502. [http://www.ncbi.nlm.nih.gov/pubmed/20148971]
  • [6] Janin J: Specific versus non-specific contacts in protein crystals. Nat Struct Biol 1997. [http://ukpmc.ac.uk/abstract/MED/9406542]
  • [7] Update on activities at the Universal Protein Resource (UniProt) in 2013. Nucleic Acids Res 2013, 41(Database issue):D43-7. [http://www.ncbi.nlm.nih.gov/pubmed/23161681]
  • [8] Ponstingl H, Kabir T, Thornton JM: Automatic inference of protein quaternary structure from crystals. J Appl Cryst 2003, 36(5):1116-1122. [http://scripts.iucr.org/cgi-bin/paper?S0021889803012421]
  • [9] Krissinel E, Henrick K: Inference of macromolecular assemblies from crystalline state. J Mol Biol 2007. [http://www.sciencedirect.com/science/article/pii/S0022283607006420]
  • [10] Xu Q, Wang G, Shapovalov M, Obradovic Z, Dunbrack RL: Statistical analysis of interface similarity in crystals of homologous proteins. J Mol Biol 2008, 381(2):487-507. [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2573399&tool=pmcentrez&rendertype=abstract]
  • [11] Xu Q, Dunbrack RL: The protein common interface database (ProtCID)–a comprehensive database of interactions of homologous proteins in multiple crystal forms. Nucleic Acids Res 2011, 39(Database issue):D761-70. [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3013667&tool=pmcentrez&rendertype=abstract]
  • [12] Monod J, Wyman J, Changeux JP: On the nature of allosteric transitions: A plausible model. J Mol Biol 1965, 12:88-118. [http://linkinghub.elsevier.com/retrieve/pii/S0022283665802856]
  • [13] Levy ED, Teichmann S: Structural, evolutionary, and assembly principles of protein oligomerization. Prog Mol Biol Transl Sci 2013, 117:25-51. [http://dx.doi.org/10.1016/B978-0-12-386931-9.00002-7]
  • [14] Winn MD, Ballard CC, Cowtan KD, Dodson EJ, Emsley P, Evans PR, Keegan RM, Krissinel EB, Leslie AG, McCoy A, et al.: Overview of the CCP4 suite and current developments. Acta Crystallogr Section D: Biol Crystallogr 2011, 67(4):235-242.
  • [15] Krissinel E: Macromolecular complexes in crystals and solutions. Acta Crystallogr Section D 2011, 67(4):376-385. [http://dx.doi.org/10.1107/S0907444911007232]
  • [16] Bahadur RP, Chakrabarti P, Rodier F, Janin J: A dissection of specific and non-specific protein-protein interfaces. J Mol Biol 2004, 336(4):943-955.
  • [17] Dauter Z, Wlodawer A, Minor W, Jaskolski M, Rupp B: Avoidable errors in deposited macromolecular structures: an impediment to efficient data mining. IUCrJ 2014, 1(3):179-193. [http://dx.doi.org/10.1107/S2052252514005442]
  • [18] Levy ED, Pereira-Leal JB, Chothia C, Teichmann SA: 3D Complex: A structural classification of protein complexes. PLoS Comput Biol 2006, 2(11):e155. [http://dx.plos.org/10.1371%2Fjournal.pcbi.0020155]
  • [19] Levy ED: PiQSi: protein quaternary structure investigation. Struct (London, England: 1993) 2007, 15(11):1364-7. [http://www.ncbi.nlm.nih.gov/pubmed/17997962]
  • [20] Poupon A, Janin J: Analysis and prediction of protein quaternary structure. Methods Mol Biol (Clifton, NJ) 2010, 609:349-364. [http://www.springerlink.com/index/10.1007/978-1-60327-241-4_20]
  • [21] Banatao DR, Cascio D, Crowley CS, Fleissner MR, Tienson HL, Yeates TO: An approach to crystallizing proteins by synthetic symmetrization. Proc Nat Acad Sci USA 2006, 103(44):16230-16235. [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1637565&tool=pmcentrez&rendertype=abstract] [http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1637565/]
  • [22] Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25(17):3389-3402. [http://dx.doi.org/10.1093/nar/25.17.3389]