Journal Article Details
BMC Genomics
WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation
Research
Piero Fariselli [1], Russ B Altman [2], Emidio Capriotti [3], Rita Casadio [4], Pier Luigi Martelli [4], Remo Calabrese [5]
[1] Department of Computer Science, University of Bologna, 40126, Bologna, Italy
[2] Departments of Bioengineering and Genetics, Stanford University, Stanford, CA, USA
[3] Division of Informatics, Department of Pathology, University of Alabama at Birmingham, Birmingham, AL, USA
[4] Laboratory of Biocomputing, Department of Biology, University of Bologna, 40126, Bologna, Italy
[5] S-IN Soluzioni Informatiche Srl, 36100, Vicenza, Italy
Keywords: Support Vector Machine; Gene Ontology; Protein Data Bank; Reliability Index; Sequence Profile
DOI: 10.1186/1471-2164-14-S3-S6
Source: Springer
【 Abstract 】

Background: SNPs&GO is a method for predicting deleterious Single Amino acid Polymorphisms (SAPs) using protein functional annotation. In this work, we present the web server implementation of SNPs&GO (WS-SNPs&GO). The server is based on Support Vector Machines (SVM). For a given protein, its input comprises the sequence and/or the three-dimensional structure (when available), a set of target variations, and the protein's functional Gene Ontology (GO) terms. For each protein variation, the server outputs the probability that the variation is associated with human disease.

Results: The server consists of two main components: updated versions of the sequence-based SNPs&GO program (recently scored as one of the best algorithms for predicting deleterious SAPs) and of the structure-based SNPs&GO3d program. The sequence-based and structure-based algorithms were extensively tested on a large set of annotated variations extracted from the SwissVar database. On a balanced dataset of more than 38,000 SAPs, the sequence-based approach achieves 81% overall accuracy, a correlation coefficient of 0.61, and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.88. On the subset of ~6,600 variations mapped onto protein structures available in the Protein Data Bank (PDB), the structure-based method achieves 84% overall accuracy, a correlation coefficient of 0.68, and an AUC of 0.91. When tested on a new blind set of variations, the server achieves 79% and 83% overall accuracy for the sequence-based and structure-based inputs, respectively.

Conclusions: WS-SNPs&GO is a valuable tool that integrates, in a single framework, information derived from protein sequence, structure, evolutionary profile, and protein function. WS-SNPs&GO is freely available at http://snps.biofold.org/snps-and-go.
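To make the reported figures concrete, the sketch below shows how overall accuracy and a correlation coefficient can be computed from the confusion matrix of a binary disease/neutral classifier. It assumes that the "correlation coefficient" in the abstract is the Matthews correlation coefficient, which the abstract does not state explicitly. The function names and the counts are hypothetical illustrations, not values or code from the paper.

```python
import math


def overall_accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all variations that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_correlation(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Assumption: this is the 'correlation coefficient' meant in the abstract.
    Returns 0.0 when any marginal total is zero (the usual convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts on a balanced set of 200 variations
# (100 disease-related, 100 neutral); not data from the paper.
tp, tn, fp, fn = 80, 82, 18, 20

print(f"accuracy = {overall_accuracy(tp, tn, fp, fn):.2f}")
print(f"MCC      = {matthews_correlation(tp, tn, fp, fn):.2f}")
```

On a balanced dataset, accuracy and the correlation coefficient move together, which is consistent with the abstract reporting both for the same test sets. The AUC is not covered here because it requires the per-variation probabilities rather than confusion-matrix counts.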

【 License 】

CC BY   
© Capriotti et al.; licensee BioMed Central Ltd. 2013
