Journal Article Details
Algorithms
Prediction of Intrinsically Disordered Proteins Using Machine Learning Based on Low Complexity Methods
Hao He [1]  Xingming Zeng [2]  Haiyuan Liu [2]
[1] Department of Communication Engineering, School of Electronic Information, Hebei University of Technology, Tianjin 300400, China;Tianjin Key Laboratory of Optoelectronic Sensor and Sensing Network Technology, School of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China;
Keywords: intrinsically disordered proteins; machine learning; permutation entropy; computational complexity
DOI: 10.3390/a15030086
Source: DOAJ
【 Abstract 】

Prediction of intrinsically disordered proteins is an active research area in bioinformatics. Because evaluating the disordered regions of protein sequences with experimental methods is costly, we use a low-complexity prediction scheme. The scheme uses sequence complexity to calculate five features for each residue of the protein sequence: the Shannon entropy, the topological entropy, the permutation entropy, and the weighted average values of two propensities. Notably, this is the first time that permutation entropy has been applied to the field of protein sequencing. In the data preprocessing stage, an appropriately sized sliding window and a comprehensive oversampling scheme are used to improve prediction performance, and two ensemble learning algorithms are used to verify the prediction results before and after permutation entropy is added. The results show that adding permutation entropy improves the performance of the prediction algorithm: the MCC value rises from 0.465 to 0.526 in our scheme, which supports its general applicability. Finally, we compare the simulation results of our scheme with those of several existing schemes to demonstrate its effectiveness.
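
As an illustration of the per-residue features named in the abstract, the following is a minimal Python sketch of sliding-window Shannon entropy and normalized permutation entropy. It is not the authors' implementation: the numeric coding of amino acids (`AA_INDEX`), the window size of 7, the order of 3, and the function names are all assumptions chosen for the example, and the topological entropy and propensity features are omitted.

```python
import math
from collections import Counter

# Hypothetical numeric coding of the 20 standard amino acids (alphabetical
# order of one-letter codes). The mapping used in the paper is not given here.
AA_INDEX = {aa: i for i, aa in enumerate("ACDEFGHIKLMNPQRSTVWY")}


def shannon_entropy(window):
    """Shannon entropy (in bits) of the symbol distribution in a window."""
    n = len(window)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(window).values())


def permutation_entropy(values, order=3):
    """Normalized permutation entropy of a numeric series, in [0, 1]."""
    if order < 2:
        raise ValueError("order must be at least 2")
    n_patterns = len(values) - order + 1
    if n_patterns < 1:
        return 0.0  # window too short to contain any ordinal pattern
    # Ordinal pattern of each length-`order` subseries. The sort is stable,
    # so tied values keep their positional order (an assumption of this sketch).
    patterns = Counter(
        tuple(sorted(range(order), key=lambda k: values[i + k]))
        for i in range(n_patterns)
    )
    h = sum(-(c / n_patterns) * math.log2(c / n_patterns) for c in patterns.values())
    return h / math.log2(math.factorial(order))


def residue_features(seq, size=7, order=3):
    """Return one (Shannon entropy, permutation entropy) pair per residue."""
    half = size // 2
    features = []
    for i in range(len(seq)):
        # Window centered on residue i, truncated at the sequence ends.
        window = seq[max(0, i - half): i + half + 1]
        # Unknown letters are mapped to one extra code.
        coded = [AA_INDEX.get(aa, len(AA_INDEX)) for aa in window]
        features.append((shannon_entropy(window), permutation_entropy(coded, order)))
    return features
```

Calling `residue_features("MKVLAAGIVGS")` would yield eleven feature pairs, one per residue. Truncating windows at the sequence ends is only one possible boundary treatment; padding is another, and the abstract does not say which the authors use.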

【 License 】

Unknown   
