2nd Annual International Conference on Information System and Artificial Intelligence
Predicting protein amidation sites by orchestrating amino acid sequence features
Physics; Computer Science
Zhao, Shuqiu^1,2 ; Yu, Hua^1,2 ; Gong, Xiujun^1,2
School of Computer Science and Technology, Tianjin University, Nankai Tianjin 30072, China^1
Tianjin Key Laboratory of Cognitive Computing and Application, Nankai Tianjin 30072, China^2
Keywords: Amino acid sequence; Correlation coefficient; Experimental methods; Feature extraction methods; Pathological process; Physicochemical property; Post-translational modifications; Support vector machine classifiers
Others: https://iopscience.iop.org/article/10.1088/1742-6596/887/1/012052/pdf DOI: 10.1088/1742-6596/887/1/012052
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】
Amidation is the fourth major category of post-translational modification and plays an important role in physiological and pathological processes. Identifying amidation sites can help us understand amidation and recognize the underlying causes of many diseases. However, traditional experimental methods for identifying amidation sites are often time-consuming and expensive. In this study, we propose a computational method for predicting amidation sites by orchestrating amino acid sequence features. Three feature extraction methods are used to build a feature vector that captures not only the physicochemical properties but also the position-related information of the amino acids. An extremely randomized trees algorithm is applied in a supervised fashion to select the optimal features, removing redundancy and dependence among the components of the feature vector. Finally, a support vector machine classifier is used to label the amidation sites. When tested on an independent data set, the proposed method performs better than all previous ones, with a prediction accuracy of 0.962, a Matthews correlation coefficient of 0.89, and an area under the curve of 0.964.
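The accuracy and Matthews correlation coefficient quoted in the abstract are both derived from the confusion matrix of a binary classifier. The following is a minimal sketch of how those two metrics are commonly computed. The function names and the toy labels are hypothetical illustrations, not the authors' code or data.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels.

    Assumed encoding (not from the paper): 1 = amidation site, 0 = non-site.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in [-1, 1].

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels for illustration only; these are not results from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(tp, tn, fp, fn))
print("MCC:", mcc(tp, tn, fp, fn))
```

Unlike accuracy, the Matthews correlation coefficient uses all four confusion-matrix cells, so it stays informative when positive and negative sites are imbalanced, which is presumably why it is reported alongside accuracy.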