Journal article details
BMC Bioinformatics
String kernels for protein sequence comparisons: improved fold recognition
Methodology Article
Saghi Nojoomi [1], Patrice Koehl [2]
[1] Biotechnology program, University of California, Davis, 1, Shields Avenue, 95616, Davis, CA, USA
[2] Department of Computer Science and Genome Center, 1, Shields Avenue, 95616, Davis, CA, USA
Keywords: Protein sequence; Kernel; Alignment free methods
DOI: 10.1186/s12859-017-1560-9
Received 2016-10-21, accepted 2017-02-23, published 2017
Source: Springer
【 Abstract 】

Background: The amino acid sequence of a protein is the blueprint from which its structure and ultimately function can be derived. Therefore, sequence comparison methods remain essential for the determination of similarity between proteins. Traditional approaches for comparing two protein sequences begin with strings of letters (amino acids) that represent the sequences, before generating textual alignments between these strings and providing scores for each alignment. When the similarity between the two protein sequences to be compared is low, however, the quality of the corresponding sequence alignment is usually poor, leading to poor performance for the recognition of similarity.

Results: In this study, we develop an alignment-free alternative to these methods that is based on the concept of string kernels. Starting from recently proposed kernels on the discrete space of protein sequences (Shen et al., Found. Comput. Math., 2013, 14:951-984), we introduce our own version, SeqKernel. Its implementation depends on two parameters: a coefficient that tunes the substitution matrix, and the maximum length of the k-mers that it includes. We provide an exhaustive analysis of the impacts of these two parameters on the performance of SeqKernel for fold recognition. We show that, with the right choice of parameters, use of the SeqKernel similarity measure improves fold recognition compared to the use of traditional alignment-based methods. We illustrate the application of SeqKernel to inferring phylogeny on RNA polymerases and show that it performs as well as methods based on multiple sequence alignments.

Conclusion: We have presented and characterized a new alignment-free method based on a mathematical kernel for scoring the similarity of protein sequences. We discuss possible improvements of this method, as well as an extension of its applications to other modeling methods that rely on sequence comparison.
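
The abstract names the two parameters of SeqKernel but does not give its formula. To illustrate how a string kernel with those two knobs can score a pair of sequences without aligning them, here is a minimal Python sketch. It is not the authors' implementation. It assumes a toy residue similarity (1.0 for identical residues, 0.2 otherwise) in place of a real substitution matrix, and the names `residue_score`, `kmer_score`, `raw_kernel`, `normalized_kernel`, `beta`, and `k_max` are hypothetical. The exponent `beta` stands in for the coefficient that tunes the substitution matrix, and `k_max` for the maximum k-mer length. Both sequences are assumed to be non-empty.

```python
import math


def residue_score(a, b, beta, match=1.0, mismatch=0.2):
    """Toy residue similarity raised to the power beta.

    Assumption: a real implementation would look up a substitution
    matrix entry here instead of using match/mismatch constants.
    """
    return (match if a == b else mismatch) ** beta


def kmer_score(u, v, beta):
    """Similarity of two equal-length k-mers: product of per-position scores."""
    score = 1.0
    for a, b in zip(u, v):
        score *= residue_score(a, b, beta)
    return score


def raw_kernel(s, t, beta, k_max):
    """Sum k-mer similarities over all k-mer pairs, for k = 1..k_max."""
    total = 0.0
    for k in range(1, k_max + 1):
        for i in range(len(s) - k + 1):
            for j in range(len(t) - k + 1):
                total += kmer_score(s[i:i + k], t[j:j + k], beta)
    return total


def normalized_kernel(s, t, beta=1.0, k_max=3):
    """Scale the raw kernel so that a sequence compared with itself scores 1."""
    norm = math.sqrt(raw_kernel(s, s, beta, k_max) * raw_kernel(t, t, beta, k_max))
    return raw_kernel(s, t, beta, k_max) / norm


if __name__ == "__main__":
    query = "ACDEFGHIK"
    print(normalized_kernel(query, "ACDEFGHIR"))  # one residue differs
    print(normalized_kernel(query, "WWYYWWYYW"))  # no residue in common
```

Under these assumptions, no alignment is ever computed: every k-mer of one sequence is compared with every k-mer of the other. Raising `beta` sharpens the contrast between matching and mismatching residues, and raising `k_max` gives more weight to longer shared stretches. The triple loop is quadratic in sequence length for each k, so this sketch is only suitable for short sequences.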

【 License 】

CC BY   
© The Author(s) 2017
