BMC Systems Biology
Mapping the stabilome: a novel computational method for classifying metabolic protein stability
Mikael Bodén1  Bostjan Kobe4  Melissa Davis1  Kim-Anh Lê Cao3  Ralph Patrick2
[1] Institute for Molecular Bioscience, The University of Queensland, St Lucia, Australia; School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, Australia; Queensland Facility for Advanced Bioinformatics, The University of Queensland, St Lucia, Australia; Australian Infectious Diseases Research Centre, The University of Queensland, St Lucia, Australia
Keywords: Prediction; Support vector machines; Bayesian networks; Post-translational modifications; Machine learning; Degradation; Protein stability
Others: 1144193; DOI: 10.1186/1752-0509-6-60
Received 2011-12-20, accepted 2012-05-16, published 2012
【 Abstract 】
Background
The half-life of a protein is regulated by a range of system properties, including the abundance of components of the degradative machinery and of protein modifiers. It is also influenced by protein-specific properties, such as a protein’s structural make-up and interaction partners. New experimental techniques coupled with powerful data integration methods now enable us not only to investigate which features govern protein stability in general, but also to build models that identify which properties determine each protein’s metabolic stability.
Results
In this work we present five groups of features useful for predicting protein stability: (1) post-translational modifications, (2) domain types, (3) structural disorder, (4) the identity of a protein’s N-terminal residue and (5) amino acid sequence. We incorporate these features into a predictive model with promising accuracy. At a 20% false positive rate, the model exhibits an 80% true positive rate, outperforming the only previously proposed stability predictor. We also investigate the impact of N-terminal protein tagging as used to generate the data set, in particular the impact it may have on the measurements for secreted and transmembrane proteins; we train and test our model on a subset of the data with those proteins removed, and show that the model sustains high accuracy. Finally, we estimate system-wide metabolic stability by surveying the whole human proteome.
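The operating point quoted above (an 80% true positive rate at a 20% false positive rate) is the kind of figure that can be read off a classifier's ranked scores by lowering the decision threshold until the allowed fraction of negatives has been admitted. The sketch below is a minimal illustration in plain Python and is not the authors' evaluation code; the function name `tpr_at_fpr` and its arguments are hypothetical.

```python
def tpr_at_fpr(scores, labels, max_fpr=0.2):
    """Return the highest true positive rate reachable with FPR <= max_fpr.

    scores: classifier outputs, higher means more likely positive.
    labels: 1 for positive examples, 0 for negative examples.
    (Illustrative helper; names are assumptions, not from the paper.)
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Rank examples by descending score and sweep the threshold downwards.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    best_tpr = 0.0
    i = 0
    while i < len(ranked):
        # Examples tied at the same score cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        if fp / n_neg > max_fpr:
            break  # this threshold admits too many negatives
        best_tpr = tp / n_pos
        i = j
    return best_tpr
```

Calling `tpr_at_fpr(scores, labels, max_fpr=0.2)` on held-out predictions gives a single number that can be compared between models at the same false positive budget.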
Conclusions
We describe a variety of protein features that are significantly over- or under-represented in stable and unstable proteins, including phosphorylation, acetylation and destabilizing N-terminal residues. Bayesian networks are ideal for combining these features into a predictive model with superior accuracy and transparency compared to the only other proposed stability predictor. Furthermore, our stability predictions of the human proteome will find application in the analysis of functionally related proteins, shedding new light on regulation by protein synthesis and degradation.
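How a Bayesian network combines heterogeneous features can be illustrated with its simplest special case, a naive Bayes classifier, in which every feature depends only on the class. The abstract does not specify the network structure used in the paper, so the following is only a sketch under that simplifying assumption; the function names, the binary feature encoding and any feature names passed in are hypothetical.

```python
import math


def train_naive_bayes(rows, labels, alpha=1.0):
    """Estimate P(class) and P(feature = 1 | class) with Laplace smoothing.

    rows: list of dicts mapping a feature name to 0 or 1.
    labels: list of class names, one per row.
    (Assumed encoding for illustration only.)
    """
    classes = sorted(set(labels))
    features = sorted(rows[0])
    prior, cond = {}, {}
    for c in classes:
        members = [r for r, y in zip(rows, labels) if y == c]
        prior[c] = len(members) / len(rows)
        cond[c] = {
            f: (sum(r[f] for r in members) + alpha) / (len(members) + 2 * alpha)
            for f in features
        }
    return prior, cond


def posterior(prior, cond, row):
    """Return P(class | row) for every class via Bayes' rule."""
    log_joint = {}
    for c in prior:
        lp = math.log(prior[c])
        for f, p in cond[c].items():
            lp += math.log(p if row[f] else 1.0 - p)
        log_joint[c] = lp
    # Normalise in log space to avoid underflow with many features.
    m = max(log_joint.values())
    unnorm = {c: math.exp(v - m) for c, v in log_joint.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}
```

Because each conditional probability table can be inspected directly, a model of this kind shows which features push a prediction towards one class or the other, which is the sense of transparency the conclusions refer to.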
【 License 】
2012 Patrick et al.; licensee BioMed Central Ltd.