| BMC Bioinformatics | |
| A unifying model of genome evolution under parsimony | |
| Benedict Paten3  Daniel R Zerbino2  Glenn Hickey3  David Haussler1  | |
| [1] Howard Hughes Medical Institute, Bethesda, MD, USA | |
| [2] European Bioinformatics Institute, Wellcome Trust Genome Campus, CB10 1SD Cambridge, UK | |
| [3] University of California, Santa Cruz, 1156 High St, 95064 Santa Cruz, USA | |
| 关键词: Ancestral reconstruction; Phylogenomics; Genome rearrangement; | |
| Others : 818400 DOI : 10.1186/1471-2105-15-206 |
|
| received in 2013-12-12, accepted in 2014-05-08, 发布年份 2014 | |
PDF
|
|
【 摘 要 】
Background
Parsimony and maximum likelihood methods of phylogenetic tree estimation and parsimony methods for genome rearrangements are central to the study of genome evolution yet to date they have largely been pursued in isolation.
Results
We present a data structure called a history graph that offers a practical basis for the analysis of genome evolution. It conceptually simplifies the study of parsimonious evolutionary histories by representing both substitutions and double cut and join (DCJ) rearrangements in the presence of duplications. The problem of constructing parsimonious history graphs thus subsumes related maximum parsimony problems in the fields of phylogenetic reconstruction and genome rearrangement. We show that tractable functions can be used to define upper and lower bounds on the minimum number of substitutions and DCJ rearrangements needed to explain any history graph. These bounds become tight for a special type of unambiguous history graph called an ancestral variation graph (AVG), which constrains in its combinatorial structure the number of operations required. We finally demonstrate that for a given history graph G, a finite set of AVGs describe all parsimonious interpretations of G, and this set can be explored with a few sampling moves.
Conclusion
This theoretical study describes a model in which the inference of genome rearrangements and phylogeny can be unified under parsimony.
【 授权许可】
2014 Paten et al.; licensee BioMed Central Ltd.
【 预 览 】
| Files | Size | Format | View |
|---|---|---|---|
| 20140711100944785.pdf | 3393KB | ||
| Figure 24. | 51KB | Image | |
| Figure 23. | 35KB | Image | |
| Figure 22. | 44KB | Image | |
| Figure 21. | 135KB | Image | |
| Figure 20. | 23KB | Image | |
| Figure 19. | 125KB | Image | |
| Figure 18. | 17KB | Image | |
| Figure 17. | 24KB | Image | |
| Figure 16. | 13KB | Image | |
| Figure 15. | 45KB | Image | |
| Figure 14. | 23KB | Image | |
| Figure 13. | 31KB | Image | |
| Figure 12. | 69KB | Image | |
| Figure 11. | 35KB | Image | |
| Figure 10. | 30KB | Image | |
| Figure 9. | 104KB | Image | |
| Figure 8. | 31KB | Image | |
| Figure 7. | 33KB | Image | |
| Figure 6. | 14KB | Image | |
| Figure 5. | 47KB | Image | |
| Figure 4. | 39KB | Image | |
| Figure 3. | 13KB | Image | |
| Figure 2. | 101KB | Image | |
| Figure 1. | 32KB | Image |
【 图 表 】
Figure 1.
Figure 2.
Figure 3.
Figure 4.
Figure 5.
Figure 6.
Figure 7.
Figure 8.
Figure 9.
Figure 10.
Figure 11.
Figure 12.
Figure 13.
Figure 14.
Figure 15.
Figure 16.
Figure 17.
Figure 18.
Figure 19.
Figure 20.
Figure 21.
Figure 22.
Figure 23.
Figure 24.
【 参考文献 】
- [1]Elias I: Settling the intractability of multiple alignment. J Comput Biol 2006, 13(7):1323-1339.
- [2]Miller W, Rosenbloom K, Hardison RC, Hou M, Taylor J, Raney B, Burhans R, King DC, Baertsch R, Blankenberg D, Kosakovsky Pond SL, Nekrutenko A, Giardine B, Harris RS, Tyekucheva S, Diekhans M, Pringle TH, Murphy WJ, Lesk S, Weinstock GM, Lindblad-Toh K, Gibbs RA, Lander ES, Siepel A, Haussler D, Kent WJ: 28-way vertebrate alignment and conservation track in the UCSC Genome browser. Genes Dev 2007, 17(12):1797-1808.
- [3]Darling AE, Mau B, Perna NT: Progressivemauve: multiple genome alignment with gene gain, loss and rearrangement. PloS one 2010, 5(6):e11147.
- [4]Paten B, Earl D, Nguyen N, Diekhans M, Zerbino D, Haussler D: Cactus: algorithms for genome multiple sequence alignment. Genome Res 2011, 21(9):1512-1528.
- [5]Felsenstein J: Inferring Phylogenies. Sinauer Associates: Sunderland; 2004.
- [6]Blanchette M, Green ED, Miller W, Haussler D: Reconstructing large regions of an ancestral mammalian genome in silico. Genome Res 2004, 14(12):2412-2423.
- [7]Kim J, Sinha S: Indelign: a probabilistic framework for annotation of insertions and deletions in a multiple alignment. Bioinformatics (Oxford, England) 2007, 23(3):289-297.
- [8]Paten B, Herrero J, Fitzgerald S, Beal K, Flicek P, Holmes I, Birney E: Genome-wide nucleotide-level mammalian ancestor reconstruction. Genome Res 2008, 18(11):1829-1843.
- [9]Day W: Computational complexity of inferring phylogenies from dissimilarity matrices. Bull Math Biol 1987, 49(4):461-467.
- [10]Chindelevitch L, Li Z, Blais E, Blanchette M: On the inference of parsimonious indel evolutionary scenarios. J Bioinform Comput Biol 2006, 4(3):721-744.
- [11]Song YS, Hein J: Constructing minimal ancestral recombination graphs. J Comput Biol 2005, 12(2):147-169.
- [12]Westesson O, Holmes I: Accurate detection of recombinant breakpoints in whole-genome alignments. PLoS Comput Biol 2009, 5(3):e1000318.
- [13]Wang LL, Zhang KK, Zhang LL: Perfect phylogenetic networks with recombination. J Comput Biol 2001, 8(1):69-78.
- [14]Bergeron A, Mixtacki J, Stoye J: A unifying view of genome rearrangements. Lecture Notes in Bioinformatics 4175:163-173.
- [15]Alekseyev M, Pevzner P: Multi-break rearrangements and chromosomal evolution. Theor Comput Sci 2008, 395(2–3):193-202.
- [16]Hannenhalli S, Pevzner PA: Transforming cabbage into turnip: polynomial algorithm for sorting signed permutations by reversals. J ACM 1999, 46(1):1-27.
- [17]Bergeron A, Mixtacki J, Stoye J: On sorting by translocations. J Comput Biol 2006, 13(2):567-578.
- [18]Yancopoulos S, Attie O, Friedberg R: Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics 2005, 21(16):3340-3346.
- [19]Caprara A: Formulations and hardness of multiple sorting by reversals. Proc. 3rd Conf. Computational Molecular Biology RECOMB99 1999, 1:84-93.
- [20]Xu AW: A fast and exact algorithm for the median of three problem: a graph decomposition approach. J Comput Biol 2009, 16(10):1369-1381.
- [21]Bourque G, Pevzner PA: Genome-scale evolution: reconstructing gene orders in the ancestral species. Genome Res 2002, 12(1):26-36.
- [22]Ma J, Ratan A, Raney BJ, Suh BB, Miller W, Haussler D: The infinite sites model of genome evolution. Proc Natl Acad Sci USA 2008, 105(38):14254-14261.
- [23]El-Mabrouk N: Sorting signed permutations by reversals and insertions/deletions of contiguous segments. J Discrete Algorithm 2000, 1(1):105-121.
- [24]Yancopoulos S, Friedberg R: DCJ path formulation for genome transformations which include insertions, deletions, and duplications. J Comput Biol 2009, 16(10):1311-1338.
- [25]Braga MD, Willing E, Stoye J: Double cut and join with insertions and deletions. J Comput Biol 2011, 18(9):1167-1184.
- [26]El-Mabrouk N, Sankoff D: Analysis of gene order evolution beyond single-copy genes. Methods Mol Biol 2012, 855:397-429.
- [27]Chauve C, El-Mabrouk N, Gueguen L, Semeria M, Tannier E: Models and Algorithms for Genome Evolution. London: Springer-Verlag; 2013.
- [28]Bader M: Genome rearrangements with duplications. BMC Bioinformatics 2010, 11(Suppl 1):S27-S27.
- [29]Shao M, Lin Y: Approximating the edit distance for genomes with duplicate genes under DCJ, insertion and deletion. BMC Bioinformatics 2012, 13(Suppl 19):S13.
- [30]Raphael B, Zhi D, Tang H, Pevzner P: A novel method for multiple alignment of sequences with repeated and shuffled elements. Genome Res 2004, 14(11):2336-2346.
- [31]Paten B, Diekhans M, Earl D, John JS, Ma J, Suh B, Haussler D: Cactus graphs for genome comparisons. J Comput Biol 2011, 18(3):469-481.
- [32]Medvedev P, Brudno M: Maximum likelihood genome assembly. J Comput Biol 2009, 16(8):1101-1116.
- [33]Edmonds J, Johnson EL: Matching: A Well-Solved Class of Integer Linear Programs. 1970, 27-30.
- [34]Tannier E, Zheng C, Sankoff D: Multichromosomal median and halving problems under different genomic distances. BMC Bioinformatics 2009, 10:120.
- [35]Bienstock D, Langston MA: Algorithmic implications of the graph minor theorem. Handbooks in Operations Research and Management Science 1994, 7:481-502.
PDF