| BMC Bioinformatics | |
| A Bayesian model for classifying all differentially expressed proteins simultaneously in 2D PAGE gels | |
| Steven H Wu [1], Michael A Black [3], Robyn A North [2], Allen G Rodrigo [4] | |
| [1] Biology Department, Duke University, Duke Box, 90338, Durham, NC, 27708, USA | |
| [2] Women's Health Academic Centre, King’s College London, London, UK | |
| [3] Department of Biochemistry, University of Otago, P. O. Box 56, Dunedin, New Zealand | |
| [4] The National Evolutionary Synthesis Center, Durham, NC, 27705, USA | |
| Keywords: Markov chain Monte Carlo (MCMC); Differentially expressed protein; Global Bayesian model; Two-dimensional polyacrylamide gel electrophoresis (2D PAGE) | |
| Others: 1088233; DOI: 10.1186/1471-2105-13-137 | |
| Received 2011-12-14, accepted 2012-05-30, published 2012 | |
【 Abstract 】
Background
Two-dimensional polyacrylamide gel electrophoresis (2D PAGE) is commonly used to identify proteins that are differentially expressed under two or more experimental or observational conditions. Wu et al. (2009) developed a univariate probabilistic model that identifies differential expression between Case and Control groups by applying a Likelihood Ratio Test (LRT) to each protein on a 2D PAGE gel. In contrast to commonly used statistical approaches, this model takes into account the two possible causes of missing values in 2D PAGE: either (1) the non-expression of a protein; or (2) a level of expression that falls below the limit of detection.
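The two causes of missing values can be combined in a single per-protein likelihood. The following is a minimal sketch, not the authors' implementation: it assumes normally distributed (log) intensities, a known detection limit, and hypothetical names (`spot_loglik`, `p`, `mu`, `sigma`, `limit`) that do not come from the source. A missing spot contributes the probability of non-expression plus the probability of being expressed below the limit; an observed spot contributes the probability of expression times the intensity density.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of Normal(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_logpdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def spot_loglik(values, p, mu, sigma, limit):
    """Log-likelihood of one protein's spot intensities across gels.

    values: intensities, with None marking a missing spot.
    p:      assumed probability that the protein is expressed (0 < p < 1).
    mu, sigma: assumed mean and standard deviation of expressed intensity.
    limit:  assumed limit of detection.
    """
    # A spot is missing if the protein is not expressed, or if it is
    # expressed but its intensity falls below the detection limit.
    p_missing = (1.0 - p) + p * normal_cdf(limit, mu, sigma)
    total = 0.0
    for v in values:
        if v is None:
            total += math.log(p_missing)
        else:
            total += math.log(p) + normal_logpdf(v, mu, sigma)
    return total
```

For example, `spot_loglik([7.2, None, 6.9], p=0.8, mu=7.0, sigma=0.5, limit=5.0)` scores two observed spots and one missing spot under these assumed parameter values.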
Results
We develop a global Bayesian model which extends the previously described model. Unlike the univariate approach, the model reported here is able to treat all differentially expressed proteins simultaneously. Whereas each protein is modelled by the univariate likelihood function previously described, several global distributions are used to model the underlying relationship between the parameters associated with individual proteins. These global distributions combine information from each protein to give more accurate estimates of the true parameters. In our implementation of the procedure, all parameters are recovered by Markov chain Monte Carlo (MCMC) integration. The 95% highest posterior density (HPD) intervals for the marginal posterior distributions are used to determine whether differences in protein expression are due to differences in mean expression intensities, and/or differences in the probabilities of expression.
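An HPD interval can be estimated directly from MCMC output. The sketch below is an illustration under stated assumptions, not the procedure in the published source code: it takes a list of posterior samples for one parameter (for instance, a hypothetical Case-minus-Control difference in mean intensity), finds the shortest window containing the requested fraction of samples, and then applies one plausible decision rule, namely flagging a difference when the interval excludes zero. The function names are invented for this example, and the shortest-window estimate is only appropriate for a unimodal marginal posterior.

```python
import math

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of consecutive sorted samples the interval must cover.
    k = max(1, math.ceil(mass * n))
    # Slide a window of k samples and keep the narrowest one.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]

def excludes_zero(interval):
    """True if the interval lies entirely above or entirely below zero."""
    lo, hi = interval
    return lo > 0.0 or hi < 0.0
```

Applied to the sampled difference in mean intensity and, separately, to the sampled difference in expression probability, a rule of this kind would indicate which of the two sources accounts for a detected difference.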
Conclusions
Simulation analyses showed that the global model accurately recovers the underlying global distributions and identifies more differentially expressed proteins than the simple application of an LRT. The simulations also indicate that the probability of incorrectly identifying a protein as differentially expressed (i.e., the False Discovery Rate) is very low. The source code is available at https://github.com/stevenhwu/BIDE-2D.
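In a simulation the true status of every protein is known, so an error rate of this kind can be computed directly. The sketch below uses the conventional definition of the false discovery rate, the fraction of proteins called differentially expressed that are in truth not; the function name and the boolean-list inputs are assumptions for illustration and are not taken from the paper's code.

```python
def false_discovery_rate(called, truth):
    """Fraction of called proteins that are not truly differentially expressed.

    called: list of booleans, True where the method flags a protein.
    truth:  list of booleans, True where the simulated protein really differs.
    """
    if len(called) != len(truth):
        raise ValueError("called and truth must have the same length")
    discoveries = sum(1 for c in called if c)
    if discoveries == 0:
        return 0.0  # no discoveries, so no false discoveries
    false_pos = sum(1 for c, t in zip(called, truth) if c and not t)
    return false_pos / discoveries
```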
【 License 】
2012 Wu et al.; licensee BioMed Central Ltd.
【 References 】
- [1] O'Farrell PH: High resolution two-dimensional electrophoresis of proteins. J Biol Chem 1975, 250(10):4007-4021.
- [2] Morris JS, Baladandayuthapani V, Herrick RC, Sanna P, Gutstein H: Automated analysis of quantitative image data using isomorphic functional mixed models, with application to proteomics data. The Annals of Applied Statistics 2011, 5:894-923.
- [3] Dowsey AW, Dunn MJ, Yang G-Z: The role of bioinformatics in two-dimensional gel electrophoresis. Proteomics 2003, 3(8):1567-1596.
- [4] Berth M, Moser FM, Kolbe M, Bernhardt J: The state of the art in the analysis of two-dimensional gel electrophoresis images. Appl Microbiol Biotechnol 2007, 76(6):1223-1243.
- [5] Chang J, Van Remmen H, Ward WF, Regnier FE, Richardson A, Cornell J: Processing of data generated by 2-dimensional gel electrophoresis for statistical analysis: missing data, normalization, and statistics. J Proteome Res 2004, 3(6):1210-1218.
- [6] Biron DG, Brun C, Lefevre T, Lebarbenchon C, Loxdale HD, Chevenet F, Brizard JP, Thomas F: The pitfalls of proteomics experiments without the correct use of bioinformatics tools. Proteomics 2006, 6(20):5577-5596.
- [7] Jacobsen S, Grove H, Nedenskov Jensen K, Sørensen HA, Jessen F, Hollung K, Uhlen AK, Jørgensen BM, Færgestad EM, Søndergaard I: Multivariate analysis of 2-DE protein patterns - Practical approaches. Electrophoresis 2007, 28(8):1289-1299.
- [8] Grove H, Hollung K, Uhlen AK, Martens H, Faergestad EM: Challenges related to analysis of protein spot volumes from two-dimensional gel electrophoresis as revealed by replicate gels. J Proteome Res 2006, 5(12):3399-3410.
- [9] Wu SH, Black MA, North RA, Atkinson KR, Rodrigo AG: A statistical model to identify differentially expressed proteins in 2D PAGE gels. PLoS Comput Biol 2009, 5(9):e1000509.
- [10] Wheelock ÅM, Buckpitt AR: Software-induced variance in two-dimensional gel electrophoresis image analysis. Electrophoresis 2005, 26(23):4508-4520.
- [11] Albrecht D, Kniemeyer O, Brakhage AA, Guthke R: Missing values in gel-based proteomics. Proteomics 2010, 10(6):1202-1211.
- [12] Krogh M, Fernandez C, Teilum M, Bengtsson S, James P: A probabilistic treatment of the missing spot problem in 2D gel electrophoresis experiments. J Proteome Res 2007, 6(8):3335-3343.
- [13] Hastings WK: Monte Carlo sampling methods using Markov chains and their applications. Biometrika 1970, 57(1):97-109.
- [14] Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E: Equation of State Calculations by Fast Computing Machines. J Chem Phys 1953, 21(6):1087-1092.
- [15] Atkinson K: Proteomic biomarker discovery for preeclampsia. PhD thesis. Auckland: University of Auckland; 2008.
- [16] Gelman A: Prior distributions for variance parameters in hierarchical models. Bayesian Analysis 2006, 1:515-533.
- [17] Roberts GO, Gelman A, Gilks WR: Weak Convergence and Optimal Scaling of Random Walk Metropolis Algorithms. Ann Appl Probab 1997, 7(1):110-120.
- [18] Roberts GO, Rosenthal JS: Optimal Scaling for Various Metropolis-Hastings Algorithms. Stat Sci 2001, 16(4):351-367.
- [19] Roberts GO, Sahu SK: Updating Schemes, Correlation Structure, Blocking and Parameterization for the Gibbs Sampler. Journal of the Royal Statistical Society Series B (Methodological) 1997, 59(2):291-317.
- [20] Liu C, Rubin DB, Wu YN: Parameter expansion to accelerate EM: The PX-EM algorithm. Biometrika 1998, 85(4):755-770.
- [21] Rambaut A, Drummond A: Tracer v1.4.1. 2007. Available from http://beast.bio.ed.ac.uk/Tracer
- [22] Binder H, Schumacher M: Incorporating pathway information into boosting estimation of high-dimensional risk prediction models. BMC Bioinforma 2009, 10(1):18.
- [23] Huelsenbeck JP, Ronquist F: MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 2001, 17(8):754-755.
- [24] Altekar G, Dwarkadas S, Huelsenbeck JP, Ronquist F: Parallel Metropolis coupled Markov chain Monte Carlo for Bayesian phylogenetic inference. Bioinformatics 2004, 20(3):407-415.