Journal Article Details
JOURNAL OF THEORETICAL BIOLOGY Volume: 312
Comprehensive comparative analysis and identification of RNA-binding protein domains: Multi-class classification and feature selection
Article
Jahandideh, Samad1  Srinivasasainagendra, Vinodh1  Zhi, Degui1 
[1] Univ Alabama Birmingham, Dept Biostat, Sect Stat Genet, Birmingham, AL 35294 USA
Keywords: RNA-binding domain; Tuned multi-class SVM; Random Forest; Multi-class l(1)/l(q)-regularized logistic regression; Prediction
DOI: 10.1016/j.jtbi.2012.07.013
Source: Elsevier
【 Abstract 】

RNA-protein interactions play an important role in various cellular processes, such as protein synthesis, gene regulation, post-transcriptional gene regulation, alternative splicing, and infection by RNA viruses. In this study, we designed an automatic procedure that uses the Gene Ontology Annotation (GOA) and Structural Classification of Proteins (SCOP) databases to capture structurally solved RNA-binding protein domains in different subclasses. We then applied tuned multi-class SVM (TMCSVM), Random Forest (RF), and multi-class l(1)/l(q)-regularized logistic regression (MCRLR) to analyze and classify RNA-binding protein domains based on a comprehensive set of sequence and structural features, and compared the prediction accuracy of these three state-of-the-art methods. In our results, TMCSVM outperforms the other methods, which suggests its potential as a useful tool for the multi-class prediction of RNA-binding protein domains. MCRLR, in turn, elucidates how much each feature contributes to the accuracy of predicting RNA-binding protein domain subclasses, and thereby provides some biological insight into the roles of sequence and structure in protein-RNA interactions. Published by Elsevier Ltd.
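
The feature-selection role attributed to MCRLR above follows from the group structure of an l(1)/l(q) penalty: each feature has one weight per subclass, and the penalty acts on the norm of that whole row, so an uninformative feature is dropped for all subclasses at once. The following is a minimal sketch of that shrinkage step only, assuming q = 2 and a proximal-gradient style update; neither is stated in the abstract, and the weight matrix `W`, the `threshold` value, and the function names are hypothetical illustrations rather than the authors' implementation.

```python
import math


def group_soft_threshold(weights, threshold):
    """Shrink each feature's row of class weights by its l2 norm (l1/l2 proximal step).

    weights: list of rows, one row per feature, one column per subclass.
    Rows whose l2 norm is at most `threshold` become all zeros.
    """
    shrunk = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row))
        # Scale factor is 0 when the row norm does not exceed the threshold.
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        shrunk.append([scale * w for w in row])
    return shrunk


def selected_features(weights, tol=1e-12):
    """Return indices of features whose weight row is not (numerically) zero."""
    return [
        j for j, row in enumerate(weights)
        if math.sqrt(sum(w * w for w in row)) > tol
    ]


# Hypothetical weights: 4 features (rows) x 3 subclasses (columns).
W = [
    [3.0, 4.0, 0.0],  # row norm 5.0: strongly used by two subclasses
    [0.3, 0.4, 0.0],  # row norm 0.5: weak everywhere
    [0.0, 0.0, 2.0],  # row norm 2.0: used by one subclass only
    [0.1, 0.2, 0.2],  # row norm 0.3: weak everywhere
]

W_shrunk = group_soft_threshold(W, threshold=1.0)
kept = selected_features(W_shrunk)
print(kept)  # [0, 2]
```

In this toy case the two weak features are removed for every subclass, while the surviving rows are shrunk but keep their direction. Ranking features by the norm of their surviving rows is one plausible way to read off importance from such a model.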

【 License 】

Free   
