BMC Bioinformatics | |
A practical approximation algorithm for solving massive instances of hybridization number for binary and nonbinary trees | |
Leo van Iersel2  Steven Kelk3  Nela Lekić3  Celine Scornavacca1  | |
[1] ISEM, CNRS – Université Montpellier II, Place Eugène Bataillon, 34095 Montpellier, France | |
[2] Centrum Wiskunde & Informatica (CWI), P.O. Box 94079, 1090 GB, Amsterdam, The Netherlands | |
[3] Department of Knowledge Engineering (DKE), Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands | |
关键词: Directed feedback vertex set; Approximation algorithms; Phylogenetic networks; Hybridization number; | |
Others : 818624 DOI : 10.1186/1471-2105-15-127 |
|
received in 2013-12-02, accepted in 2014-04-24, 发布年份 2014 | |
【 摘 要 】
Background
Reticulate events play an important role in determining evolutionary relationships. The problem of computing the minimum number of such events to explain discordance between two phylogenetic trees is a hard computational problem. Even for binary trees, exact solvers struggle to solve instances with reticulation number larger than 40-50.
Results
Here we present CYCLEKILLER and NONBINARYCYCLEKILLER, the first methods to produce solutions verifiably close to optimality for instances with hundreds or even thousands of reticulations.
Conclusions
Using simulations, we demonstrate that these algorithms run quickly for large and difficult instances, producing solutions that are very close to optimality. As a spin-off from our simulations we also present TERMINUSEST, which is the fastest exact method currently available that can handle nonbinary trees: this is used to measure the accuracy of the NONBINARYCYCLEKILLER algorithm. All three methods are based on extensions of previous theoretical work (SIDMA 26(4):1635-1656, TCBB 10(1):18-25, SIDMA 28(1):49-66) and are publicly available. We also apply our methods to real data.
【 授权许可】
2014 van Iersel et al.; licensee BioMed Central Ltd.
【 预 览 】
Files | Size | Format | View |
---|---|---|---|
20140711124719582.pdf | 338KB | download | |
Figure 1. | 21KB | Image | download |
【 图 表 】
Figure 1.
【 参考文献 】
- [1]Gascuel O, (ed.): Mathematics of Evolution and Phylogeny. UK: Oxford University Press Inc.; 2005.
- [2]Gascuel O, Steel M, (eds.): Reconstructing Evolution: New Mathematical and Computational Advances. UK: Oxford University Press; 2007.
- [3]Bapteste E, van Iersel LJJ, Janke A, Kelchner S, Kelk SM, McInerney JO, Morrison DA, Nakhleh L, Steel M, Stougie L, Whitfield J: Networks: expanding evolutionary thinking. Trends Genet 2013, 29(8):439-441.
- [4]Huson DH, Rupp R, Scornavacca C: Phylogenetic Networks: Concepts, Algorithms and Applications . UK: Cambridge University Press; 2011.
- [5]Huson DH, Scornavacca C: A survey of combinatorial methods for phylogenetic networks. Genome Biol Evol 2011, 3:23-35.
- [6]Nakhleh L: Evolutionary phylogenetic networks: models and issues. In The Problem Solving Handbook for Computational Biology and Bioinformatics. Edited by Heath L, Ramakrishnan N. Berlin: Springer; 2009.
- [7]Bordewich M, Semple C: Computing the minimum number of hybridization events for a consistent evolutionary history. Discrete Appl Math 2007, 155(8):914-928.
- [8]Flum J, Grohe M: Parameterized Complexity Theory. Berlin: Springer; 2006.
- [9]Downey RG, Fellows MR: Parameterized Complexity (Monographs in Computer Science). Berlin: Springer; 1999.
- [10]Bordewich M, Linz S, John KS, Semple C: A reduction algorithm for computing the hybridization number of two trees. Evol Bioinform 2007, 3:86-98.
- [11]Chen Z-Z, Wang L: Hybridnet: a tool for constructing hybridization networks. Bioinformatics 2010, 26(22):2912-2913.
- [12]Collins J, Linz S, Semple C: Quantifying hybridization in realistic time. J Comp Biol 2011, 18:1305-1318.
- [13]Whidden C, Beiko RG, Zeh N: Fixed-parameter algorithms for maximum agreement forests. SIAM J Comput 42(4):1431-1466.
- [14]Huson DH, Scornavacca C: Dendroscope 3: An interactive tool for rooted phylogenetic trees and networks. Syst Biol 2012, 61(6):1061-1067.
- [15]Albrecht B, Scornavacca C, Cenci A, Huson DH: Fast computation of minimum hybridization networks. Bioinformatics 2012, 28(2):191-197.
- [16]Chen Z-Z, Wang L: Algorithms for reticulate networks of multiple phylogenetic trees. IEEE/ACM Trans Comput Biol Bioinf 2012, 9(2):372-384.
- [17]Chen Z-Z, Wang L: An ultrafast tool for minimum reticulate networks. J Comput Biol 2013, 20(1):38-41.
- [18]Piovesan T, Kelk S: A simple fixed parameter tractable algorithm for computing the hybridization number of two (not necessarily binary) trees. IEEE/ACM Trans Comput Biol Bioinf 2013, 10(1):18-25.
- [19]Linz S, Semple C: Hybridization in non-binary trees. IEEE/ACM Trans Comput Biol Bioinf 2009, 6(1):30-45.
- [20]Kelk SM, van Iersel LJJ, Lekic N, Linz S, Scornavacca C, Stougie L: Cycle killer...qu’est-ce que c’est? on the comparative approximability of hybridization number and directed feedback vertex set. SIAM J Discr Math 2012, 26(4):1635-1656.
- [21]van Iersel LJJ, Kelk SM, Stougie L, Lekić N: Approximation algorithms for nonbinary agreement forests. SIAM J Discrete Math 2014, 28(1):49-66.
- [22]Whidden C: rSPR. http://kiwi.cs.dal.ca/Software/RSPR webcite
- [23]Whidden C, Beiko RG, Zeh N: Fast FPT algorithms for computing rooted agreement forests: Theory and experiments. Proceedings of the 9th International Symposium on Experimental Algorithms (SEA) Lect Notes Comput Sc, vol. 6049, pp. 141–153 SpringerL: Berlin; 2010
- [24]Whidden C, Beiko RG, Zeh N: Fixed-Parameter and Approximation Algorithms for Maximum Agreement Forests of Multifurcating Trees. ArXiv preprint: http://arxiv.org/abs/1305.0512 webcite (2013)
- [25]Kelk SM: CYCLEKILLER. http://skelk.sdf-eu.org/cyclekiller webcite
- [26]van Iersel LJJ: NONBINARYCYCLEKILLER. http://homepages.cwi.nl/~iersel/cyclekiller webcite
- [27]Kelk SM: TERMINUSEST. http://skelk.sdf-eu.org/terminusest webcite
- [28]van Iersel LJJ, Kelk SM, Lekic N, Scornavacca C: A practical approximation algorithm for solving massive instances of hybridization number. In Algorithms in Bioinformatics. Lect Notes Comput Sc, vol. 7534, pp. 430–440. . Edited by Raphael B, Tang J. Berlin: Springer; 2012.
- [29]Baroni M, Grünewald S, Moulton V, Semple C: Bounding the number of hybridisation events for a consistent evolutionary history. J Math Biol 2005, 51:171-182.
- [30]Rouard M, Guignon V, Aluome C, Laporte M-A, Droc G, Walde C, Zmasek CM, Périn C, Conte MG: Greenphyldb v2.0: comparative and functional genomics in plants. Nucleic Acids Res 2010. doi:10.1093/nar/gkq811. Epub 2010 Sep 22
- [31]Scornavacca C, Berry V, Ranwez V: Building species trees from larger parts of phylogenomic databases. Inform Comput 2011, 209(3):590-605.
- [32]Scornavacca C: SSIMUL. http://www.atgc-montpellier.fr/ssimul/ webcite
- [33]Even G, Naor J, Schieber B, Sudan M: Approximating minimum feedback sets and multicuts in directed graphs. Algorithmica 1998, 20(2):151-174.