Mathematical Biosciences and Engineering
Deep neural learning based protein function prediction
Lichuan Gu1, Hongwei Zhang2, Hui Wang2, Jun Jiao2, Zihao Zhao2, Minglei Hu2, Chao Wang2, Ning Yang2, Wenjun Xu3
[1] School of Information and Computer, Anhui Agricultural University, Hefei 230036, China; Key Laboratory of Agricultural Electronic Commerce, Ministry of Agriculture, Hefei 230036, China; Institute of Intelligent Agriculture, Anhui Agricultural University, Hefei 230036, China; School of Life Sciences, Anhui Agricultural University, Hefei 230036, China
[2] School of Information and Computer, Anhui Agricultural University, Hefei 230036, China; Key Laboratory of Agricultural Electronic Commerce, Ministry of Agriculture, Hefei 230036, China; Institute of Intelligent Agriculture, Anhui Agricultural University, Hefei 230036, China
[3] Key Laboratory of Agricultural Electronic Commerce, Ministry of Agriculture, Hefei 230036, China; Institute of Intelligent Agriculture, Anhui Agricultural University, Hefei 230036, China; School of Life Sciences, Anhui Agricultural University, Hefei 230036, China
Keywords: protein function prediction; protein-protein interaction (PPI); deep neural network (DNN); kernel principal component analysis (KPCA); grasshopper optimization algorithm (GOA)
DOI: 10.3934/mbe.2022114
Source: DOAJ
【Abstract】
Protein function prediction is vital for annotating uncharacterized proteins. At present, Deep Neural Network (DNN) based protein function prediction is mainly carried out on small-scale protein datasets or on Gene Ontology data, and usually explores the relationship between a single protein feature and function tags. Practical methods for large-scale, multi-feature protein function prediction still need to be studied in depth. This paper proposes IGP-DNN, a DNN-based protein function prediction approach. The method uses a protein function module extraction algorithm based on the Grasshopper Optimization Algorithm (GOA) and Intuitionistic Fuzzy c-Means clustering (IFCM) to extract protein module features, applies Kernel Principal Component Analysis (KPCA) to reduce the dimensionality of the protein attribute information, and integrates the module features with the attribute features. The integrated data are then fed into a DNN with multiple hidden layers to classify proteins and predict their functions. In experiments on the DIP dataset, IGP-DNN reaches an F-measure of 0.4436, which the authors describe as better performance.
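The feature-integration and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF kernel, the `gamma` value, the array sizes, and the function names `rbf_kernel_pca` and `f_measure` are all assumptions made for illustration, and the module features are random placeholders standing in for the output of the GOA/IFCM module extraction, which is not reproduced here. The DNN classifier itself is also omitted.

```python
import numpy as np

def rbf_kernel_pca(X, n_components, gamma):
    """Project the rows of X onto the top components in RBF-kernel feature space.

    The RBF kernel is an assumption; the abstract does not name the kernel.
    """
    # Pairwise squared Euclidean distances between samples.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))

    # Center the kernel matrix in feature space.
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    top = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[top], eigvecs[:, top]

    # Coordinates of the training samples along each kernel component.
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))

def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for binary 0/1 label arrays."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)

rng = np.random.default_rng(0)
# Hypothetical data: 20 proteins with 10 raw attribute features each.
attribute_features = rng.normal(size=(20, 10))
# Placeholder for module features (e.g. soft cluster memberships).
module_features = rng.random(size=(20, 4))

# Reduce the attribute features, then concatenate with the module features.
reduced_attributes = rbf_kernel_pca(attribute_features, n_components=3, gamma=0.1)
integrated = np.hstack([module_features, reduced_attributes])
print(integrated.shape)
```

In this sketch `integrated` is the matrix that would be passed to a classifier; how IGP-DNN actually weights or normalizes the two feature groups before integration is not stated in the abstract.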
【License】
Unknown