Journal Article Details
PeerJ
Prediction of DNA binding proteins using local features and long-term dependencies with primary sequences based on deep learning
article
Guobin Li [1], Xiuquan Du [2], Xinlu Li [1], Le Zou [1], Guanhong Zhang [1], Zhize Wu [1]
[1] School of Artificial Intelligence and Big Data, Hefei University
[2] School of Computer Science and Technology, Anhui University
Keywords: DNA binding protein prediction; Deep learning; Convolution neural network (CNN); Long short-term memory network (LSTM); Long-term dependence; Fusion approach
DOI: 10.7717/peerj.11262
Subject classification: Social Sciences, Humanities and Arts (General)
Source: Inra
【 Abstract 】
DNA-binding proteins (DBPs) play pivotal roles in many biological functions, such as alternative splicing, RNA editing, and methylation. Many traditional machine learning (ML) and deep learning (DL) methods have been proposed to predict DBPs. However, these methods either rely on manual feature extraction or fail to capture long-term dependencies in the sequence. In this paper, we propose a method called PDBP-Fusion, which identifies DBPs from primary sequences alone by fusing local features with long-term dependencies. We use a convolutional neural network (CNN) to learn local features and a bi-directional long short-term memory network (Bi-LSTM) to capture critical long-term dependencies in context. In addition, feature extraction, model training, and model prediction are performed simultaneously. PDBP-Fusion predicts DBPs with 86.45% sensitivity, 79.13% specificity, 82.81% accuracy, and a Matthews correlation coefficient (MCC) of 0.661 on the PDB14189 benchmark dataset. The MCC of the proposed method is at least 9.1% higher than that of other advanced prediction models. Moreover, PDBP-Fusion also achieves superior performance and model robustness on the PDB2272 independent dataset. These results demonstrate that PDBP-Fusion can predict DBPs from sequences accurately and effectively. The online server is available at http://119.45.144.26:8080/PDBP-Fusion/.
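The four metrics reported in the abstract (sensitivity, specificity, accuracy, and MCC) can all be derived from a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions, not the authors' evaluation code. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute sensitivity, specificity, accuracy, and MCC from confusion counts.

    tp/fn: DBPs predicted as DBP / as non-DBP (positives).
    tn/fp: non-DBPs predicted as non-DBP / as DBP (negatives).
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Matthews correlation coefficient; defined as 0.0 here when any
    # marginal is empty, which is a common convention (an assumption).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not the paper's results.
    for name, value in binary_metrics(tp=80, fn=20, tn=70, fp=30).items():
        print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it a more informative single-number summary when sensitivity and specificity differ.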
【 License 】

CC BY   
