Journal Article Details
BMC Bioinformatics
A two-layered machine learning method to identify protein O-GlcNAcylation sites with O-GlcNAc transferase substrate motifs
Research
Hui-Ju Kao1  Kai-Yao Huang1  Cheng-Tsung Lu1  Tzong-Yi Lee2  Chien-Hsun Huang3  Shun-Long Weng4  Neil Arvin Bretaña5 
[1] Department of Computer Science and Engineering, Yuan Ze University, 320, Taoyuan, Taiwan;Department of Computer Science and Engineering, Yuan Ze University, 320, Taoyuan, Taiwan;Innovation Center for Big Data and Digital Convergence, Yuan Ze University, 320, Taoyuan, Taiwan;Department of Computer Science and Engineering, Yuan Ze University, 320, Taoyuan, Taiwan;Tao-Yuan Hospital, Ministry of Health & Welfare, 320, Taoyuan, Taiwan;Department of Obstetrics and Gynecology, Hsinchu Mackay Memorial Hospital, 300, Hsin-Chu, Taiwan;Mackay Junior College of Medicine, Nursing and Management, 112, Taipei, Taiwan;Department of Medicine, Mackay Medical College, 252, New Taipei City, Taiwan;Inflammation and Infection Research Centre, School of Medical Sciences, University of New South Wales, Sydney, Australia;
Keywords: O-GlcNAcylation; O-linked glycosylation; O-GlcNAc transferase (OGT); substrate motif; profile hidden Markov model; support vector machine
DOI: 10.1186/1471-2105-16-S18-S10
Source: Springer
【 Abstract 】

Protein O-GlcNAcylation, the β-attachment of a single N-acetylglucosamine (GlcNAc) to the hydroxyl group of serine or threonine residues, is an O-linked glycosylation catalyzed by O-GlcNAc transferase (OGT). Molecular-level investigation of the basis for OGT's substrate specificity should aid understanding of how O-GlcNAc contributes to diverse cellular processes. Because mass spectrometry (MS)-based proteomics has identified an increasing number of O-GlcNAcylated peptides with site-specific information, we were motivated to characterize the substrate site motifs of O-GlcNAc transferases. In this investigation, a non-redundant dataset of 410 experimentally verified O-GlcNAcylation sites was manually extracted from dbOGAP, OGlycBase and UniProtKB. After conserved motifs were detected by maximal dependence decomposition, a profile hidden Markov model (profile HMM) was adopted to learn a first-layer model for each identified OGT substrate motif. A support vector machine (SVM) was then used to generate a second-layer model learned from the output values of the profile HMMs in the first layer. The two-layered predictive model was evaluated using five-fold cross-validation, which yielded a sensitivity of 85.4%, a specificity of 84.1%, and an accuracy of 84.7%. Additionally, an independent testing set from PhosphoSitePlus, which was non-homologous to the training data of the predictive model, was used to demonstrate that the proposed method could provide a promising accuracy (84.05%) and outperform other O-GlcNAcylation site prediction tools. A case study indicated that the proposed method could be a feasible means of conducting preliminary analyses of protein O-GlcNAcylation. The method has been implemented as a web-based system, OGTSite, which is freely available at http://csb.cse.yzu.edu.tw/OGTSite/.
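
The abstract mentions two mechanical steps that a short sketch can make concrete: collecting candidate serine/threonine sites from a sequence, and computing sensitivity, specificity and accuracy from prediction counts. The Python below is only an illustration under stated assumptions, not the authors' implementation. The function names, the flank length of 10 residues and the `-` padding character are hypothetical; the abstract does not specify a window size. The metric formulas are the standard definitions.

```python
from typing import Dict, List, Tuple


def candidate_windows(sequence: str, flank: int = 10, pad: str = "-") -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every serine or threonine residue.

    `flank` and `pad` are illustrative assumptions; the abstract does not
    state the window length used to build the motif models.
    """
    # Pad both ends so that sites near the termini still yield full-length windows.
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue in "ST":
            # Position i in `sequence` sits at index i + flank in `padded`,
            # so the window centred on it starts at index i.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def evaluate(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # fraction of true sites recovered
        "specificity": tn / (tn + fp),            # fraction of non-sites rejected
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }
```

In a two-layered setup like the one described, each window would be scored by every motif-specific profile HMM, and the resulting score vector would be passed to the SVM; those steps depend on external tools and are left out of this sketch.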

【 License 】

© Kao et al.; 2015. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
