BioData Mining
Accurate prediction of major histocompatibility complex class II epitopes by sparse representation via ℓ1-minimization
Clemente Aguilar-Bonavides [1], Reinaldo Sanchez-Arias [2], Cristina Lanzas [3]
[1] National Institute for Mathematical and Biological Synthesis, University of Tennessee, 37996-3410 Knoxville, TN, USA
[2] Department of Applied Mathematics, Wentworth Institute of Technology, 02115 Boston, MA, USA
[3] Department of Biomedical and Diagnostic Sciences, University of Tennessee, 37996-3410 Knoxville, TN, USA
Keywords: Classification algorithms; Machine learning; Immunoinformatics; Epitope prediction; MHC class II; Sparse representation; Peptide binding
Others: 1083992; DOI: 10.1186/1756-0381-7-23
Received 2014-03-03, accepted 2014-10-25, published 2014
【 Abstract 】
Background
The major histocompatibility complex (MHC) is responsible for presenting antigens (epitopes) on the surface of antigen-presenting cells (APCs). When pathogen-derived epitopes are presented by MHC class II on an APC surface, T cells may be able to trigger a specific immune response. Prediction of MHC-II epitopes is particularly challenging because the open binding cleft of the MHC-II molecule allows epitopes to bind beyond the peptide binding groove; the molecule can therefore accommodate peptides of variable length. Among the methods proposed to predict MHC-II epitopes, artificial neural networks (ANNs) and support vector machines (SVMs) are the most effective. We propose a novel classification algorithm to predict MHC-II epitopes, called sparse representation via ℓ1-minimization.
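The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way to build a classifier from sparse representation via ℓ1-minimization. It assumes each peptide has already been encoded as a numeric feature vector, uses a LASSO-style relaxation solved by iterative soft-thresholding (ISTA) as a stand-in for whatever solver the authors used, and assigns a test peptide to the class whose training columns reconstruct it with the smallest residual. The function names `ista_l1` and `src_predict`, the penalty `lam`, and the iteration count are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ista_l1(A, y, lam=0.01, n_iter=500):
    """Approximately solve min_x 0.5*||A x - y||^2 + lam*||x||_1 with ISTA.

    Assumption: this LASSO relaxation stands in for the paper's
    l1-minimization step, whose exact form the abstract does not give.
    """
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L          # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

def src_predict(A, labels, y, lam=0.01):
    """Classify y by the class whose training columns reconstruct it best.

    A      : (n_features, n_train) matrix, one encoded training peptide per column
    labels : length-n_train array of class labels (e.g. 1 = binder, 0 = non-binder)
    y      : length-n_features encoded test peptide
    """
    labels = np.asarray(labels)
    A = A / np.linalg.norm(A, axis=0, keepdims=True)  # unit-norm columns
    x = ista_l1(A, y, lam)
    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)  # keep only coefficients of class c
        residuals[c] = np.linalg.norm(y - A @ x_c)
    return min(residuals, key=residuals.get)

# Toy usage with made-up 5-dimensional "encodings" (not real peptide data).
rng = np.random.default_rng(0)
binders = np.array([1.0, 0, 0, 0, 0]) + 0.05 * rng.standard_normal((10, 5))
non_binders = np.array([0, 1.0, 0, 0, 0]) + 0.05 * rng.standard_normal((10, 5))
A_train = np.vstack([binders, non_binders]).T
y_train = np.array([1] * 10 + [0] * 10)
query = np.array([1.0, 0.05, 0.0, 0.0, 0.0])
predicted = src_predict(A_train, y_train, query)
```

In this sketch the only tunable quantity is the sparsity penalty `lam`; there is no kernel to choose, which is one way to read the abstract's remark about model selection.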
Results
We obtained a collection of experimentally confirmed MHC-II epitopes from the Immune Epitope Database and Analysis Resource (IEDB) and applied our ℓ1-minimization algorithm. To benchmark the performance of our proposed algorithm, we compared our predictions against an SVM classifier. We measured sensitivity, specificity and accuracy; we then used Receiver Operating Characteristic (ROC) analysis to evaluate the performance of our method. The performance of the ℓ1-minimization algorithm in predicting MHC-II epitopes was generally comparable and, in some cases, superior to that of the standard SVM classification method, and it overcame the lack of robustness of other methods with respect to outliers. While our method consistently favoured DPPS encoding with the alleles tested, SVM showed slightly better accuracy when “11-factor” encoding was used.
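The evaluation measures named above can be sketched as follows. This is a generic illustration under the usual definitions (binders labelled 1, non-binders labelled 0), not the authors' evaluation code; the helper names `binary_metrics` and `roc_auc` and the toy labels and scores are assumptions made for the example. The area under the ROC curve is computed from its rank interpretation: the probability that a randomly chosen binder scores higher than a randomly chosen non-binder, with ties counted as one half.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for 0/1 labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # binders predicted as binders
    tn = np.sum((y_true == 0) & (y_pred == 0))  # non-binders predicted as non-binders
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }

def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]  # every binder vs. every non-binder
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (len(pos) * len(neg)))

# Toy labels and scores, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.3, 0.2, 0.1, 0.7]
metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

The pairwise form is quadratic in the number of samples, which is fine for a small benchmark; a sort-based computation would be preferable for large datasets.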
Conclusions
ℓ1-minimization achieves accuracy similar to that of SVM and has additional advantages, such as overcoming the lack of robustness with respect to outliers. In addition, ℓ1-minimization involves no model selection dependency.
【 License 】
2014 Aguilar-Bonavides et al.; licensee BioMed Central Ltd.