Journal article details
BMC Bioinformatics
A greedy alignment-free distance estimator for phylogenetic inference
Research
Sharma V. Thankachan [1]; Sriram P. Chockalingam [2]; Srinivas Aluru [3]; Yongchao Liu [4]; Ambujam Krishnan [5]
[1] Department of Computer Science, University of Central Florida, 32816, Orlando, FL, USA; Institute for Data Engineering and Science, Georgia Institute of Technology, 30332, Atlanta, GA, USA; Institute for Data Engineering and Science, Georgia Institute of Technology, 30332, Atlanta, GA, USA; School of Computational Science and Engineering, Georgia Institute of Technology, 30332, Atlanta, GA, USA; School of Computational Science and Engineering, Georgia Institute of Technology, 30332, Atlanta, GA, USA; School of Electrical Engineering and Computer Science, Louisiana State University, 70703, Baton Rouge, LA, USA
Keywords: Alignment-free methods; Sequence comparison; Phylogeny reconstruction
DOI: 10.1186/s12859-017-1658-0
Source: Springer
【 Abstract 】

Background: Alignment-free sequence comparison approaches have been garnering increasing interest in various data- and compute-intensive applications such as phylogenetic inference for large-scale sequences. While k-mer based methods are predominantly used in real applications, the average common substring (ACS) approach is emerging as one of the prominent alignment-free approaches. This ACS approach has been further generalized by some recent work, either greedily or exactly, by allowing a bounded number of mismatches in the common substrings.

Results: We present ALFRED-G, a greedy alignment-free distance estimator for phylogenetic tree reconstruction based on the concept of the generalized ACS approach. In this algorithm, we have investigated a new heuristic to efficiently compute the lengths of common strings with mismatches allowed, and have further applied this heuristic to phylogeny reconstruction. Performance evaluation using real sequence datasets shows that our heuristic is able to reconstruct comparable, or even more accurate, phylogenetic tree topologies than the kmacs heuristic algorithm at highly competitive speed.

Conclusions: ALFRED-G is an alignment-free heuristic for evolutionary distance estimation between two biological sequences. This algorithm is implemented in C++ and has been incorporated into our open-source ALFRED software package (http://alurulab.cc.gatech.edu/phylo).
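The abstract does not spell out how an ACS-style distance with mismatches is computed, so the following is only an illustrative sketch. It assumes the commonly cited ACS formulation (average, over all positions of one sequence, of the longest match found in the other, followed by a log-length normalization) and extends it to at most `k` mismatches by brute force. It is not the ALFRED-G heuristic and not the kmacs algorithm; the function names `longest_match_with_mismatches`, `acs_k`, and `acs_distance` are hypothetical.

```python
import math


def longest_match_with_mismatches(x, i, y, k):
    """Longest substring of x starting at i that matches some substring
    of y with at most k mismatches (brute force over all starts in y)."""
    best = 0
    for j in range(len(y)):
        mismatches = 0
        length = 0
        while i + length < len(x) and j + length < len(y):
            if x[i + length] != y[j + length]:
                mismatches += 1
                if mismatches > k:
                    break
            length += 1
        best = max(best, length)
    return best


def acs_k(x, y, k):
    """Average of the longest k-mismatch match lengths over all positions of x."""
    total = sum(longest_match_with_mismatches(x, i, y, k) for i in range(len(x)))
    return total / len(x)


def directed_distance(x, y, k):
    """Assumed normalization: log|y| / ACS_k(x, y) - 2 log|x| / |x|."""
    avg = acs_k(x, y, k)
    if avg == 0:
        return math.inf  # no shared characters at all
    return math.log(len(y)) / avg - 2 * math.log(len(x)) / len(x)


def acs_distance(x, y, k=0):
    """Symmetrized distance: mean of the two directed distances."""
    return (directed_distance(x, y, k) + directed_distance(y, x, k)) / 2


if __name__ == "__main__":
    print(acs_distance("ACGTACGT", "ACGTTCGT", k=1))
```

This brute-force version takes time roughly cubic in the sequence length, which is why practical tools rely on suffix-based indexes and heuristics; it is meant only to make the quantity being estimated concrete.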

【 License 】

CC BY   
© The Author(s) 2017
