期刊论文详细信息
BMC Bioinformatics
A unifying model of genome evolution under parsimony
Benedict Paten3  Daniel R Zerbino2  Glenn Hickey3  David Haussler1 
[1] Howard Hughes Medical Institute, Bethesda, MD, USA
[2] European Bioinformatics Institute, Wellcome Trust Genome Campus, CB10 1SD Cambridge, UK
[3] University of California, Santa Cruz, 1156 High St, 95064 Santa Cruz, USA
关键词: Ancestral reconstruction;    Phylogenomics;    Genome rearrangement;   
Others  :  818400
DOI  :  10.1186/1471-2105-15-206
 received in 2013-12-12, accepted in 2014-05-08,  发布年份 2014
PDF
【 摘 要 】

Background

Parsimony and maximum likelihood methods of phylogenetic tree estimation and parsimony methods for genome rearrangements are central to the study of genome evolution yet to date they have largely been pursued in isolation.

Results

We present a data structure called a history graph that offers a practical basis for the analysis of genome evolution. It conceptually simplifies the study of parsimonious evolutionary histories by representing both substitutions and double cut and join (DCJ) rearrangements in the presence of duplications. The problem of constructing parsimonious history graphs thus subsumes related maximum parsimony problems in the fields of phylogenetic reconstruction and genome rearrangement. We show that tractable functions can be used to define upper and lower bounds on the minimum number of substitutions and DCJ rearrangements needed to explain any history graph. These bounds become tight for a special type of unambiguous history graph called an ancestral variation graph (AVG), which constrains in its combinatorial structure the number of operations required. We finally demonstrate that for a given history graph G, a finite set of AVGs describe all parsimonious interpretations of G, and this set can be explored with a few sampling moves.

Conclusion

This theoretical study describes a model in which the inference of genome rearrangements and phylogeny can be unified under parsimony.

【 授权许可】

   
2014 Paten et al.; licensee BioMed Central Ltd.

【 预 览 】
附件列表
Files Size Format View
20140711100944785.pdf 3393KB PDF download
Figure 24. 51KB Image download
Figure 23. 35KB Image download
Figure 22. 44KB Image download
Figure 21. 135KB Image download
Figure 20. 23KB Image download
Figure 19. 125KB Image download
Figure 18. 17KB Image download
Figure 17. 24KB Image download
Figure 16. 13KB Image download
Figure 15. 45KB Image download
Figure 14. 23KB Image download
Figure 13. 31KB Image download
Figure 12. 69KB Image download
Figure 11. 35KB Image download
Figure 10. 30KB Image download
Figure 9. 104KB Image download
Figure 8. 31KB Image download
Figure 7. 33KB Image download
Figure 6. 14KB Image download
Figure 5. 47KB Image download
Figure 4. 39KB Image download
Figure 3. 13KB Image download
Figure 2. 101KB Image download
Figure 1. 32KB Image download
【 图 表 】

Figure 1.

Figure 2.

Figure 3.

Figure 4.

Figure 5.

Figure 6.

Figure 7.

Figure 8.

Figure 9.

Figure 10.

Figure 11.

Figure 12.

Figure 13.

Figure 14.

Figure 15.

Figure 16.

Figure 17.

Figure 18.

Figure 19.

Figure 20.

Figure 21.

Figure 22.

Figure 23.

Figure 24.

【 参考文献 】
  • [1]Elias I: Settling the intractability of multiple alignment. J Comput Biol 2006, 13(7):1323-1339.
  • [2]Miller W, Rosenbloom K, Hardison RC, Hou M, Taylor J, Raney B, Burhans R, King DC, Baertsch R, Blankenberg D, Kosakovsky Pond SL, Nekrutenko A, Giardine B, Harris RS, Tyekucheva S, Diekhans M, Pringle TH, Murphy WJ, Lesk S, Weinstock GM, Lindblad-Toh K, Gibbs RA, Lander ES, Siepel A, Haussler D, Kent WJ: 28-way vertebrate alignment and conservation track in the UCSC Genome browser. Genes Dev 2007, 17(12):1797-1808.
  • [3]Darling AE, Mau B, Perna NT: Progressivemauve: multiple genome alignment with gene gain, loss and rearrangement. PloS one 2010, 5(6):e11147.
  • [4]Paten B, Earl D, Nguyen N, Diekhans M, Zerbino D, Haussler D: Cactus: algorithms for genome multiple sequence alignment. Genome Res 2011, 21(9):1512-1528.
  • [5]Felsenstein J: Inferring Phylogenies. Sinauer Associates: Sunderland; 2004.
  • [6]Blanchette M, Green ED, Miller W, Haussler D: Reconstructing large regions of an ancestral mammalian genome in silico. Genome Res 2004, 14(12):2412-2423.
  • [7]Kim J, Sinha S: Indelign: a probabilistic framework for annotation of insertions and deletions in a multiple alignment. Bioinformatics (Oxford, England) 2007, 23(3):289-297.
  • [8]Paten B, Herrero J, Fitzgerald S, Beal K, Flicek P, Holmes I, Birney E: Genome-wide nucleotide-level mammalian ancestor reconstruction. Genome Res 2008, 18(11):1829-1843.
  • [9]Day W: Computational complexity of inferring phylogenies from dissimilarity matrices. Bull Math Biol 1987, 49(4):461-467.
  • [10]Chindelevitch L, Li Z, Blais E, Blanchette M: On the inference of parsimonious indel evolutionary scenarios. J Bioinform Comput Biol 2006, 4(3):721-744.
  • [11]Song YS, Hein J: Constructing minimal ancestral recombination graphs. J Comput Biol 2005, 12(2):147-169.
  • [12]Westesson O, Holmes I: Accurate detection of recombinant breakpoints in whole-genome alignments. PLoS Comput Biol 2009, 5(3):e1000318.
  • [13]Wang LL, Zhang KK, Zhang LL: Perfect phylogenetic networks with recombination. J Comput Biol 2001, 8(1):69-78.
  • [14]Bergeron A, Mixtacki J, Stoye J: A unifying view of genome rearrangements. Lecture Notes in Bioinformatics 4175:163-173.
  • [15]Alekseyev M, Pevzner P: Multi-break rearrangements and chromosomal evolution. Theor Comput Sci 2008, 395(2–3):193-202.
  • [16]Hannenhalli S, Pevzner PA: Transforming cabbage into turnip: polynomial algorithm for sorting signed permutations by reversals. J ACM 1999, 46(1):1-27.
  • [17]Bergeron A, Mixtacki J, Stoye J: On sorting by translocations. J Comput Biol 2006, 13(2):567-578.
  • [18]Yancopoulos S, Attie O, Friedberg R: Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics 2005, 21(16):3340-3346.
  • [19]Caprara A: Formulations and hardness of multiple sorting by reversals. Proc. 3rd Conf. Computational Molecular Biology RECOMB99 1999, 1:84-93.
  • [20]Xu AW: A fast and exact algorithm for the median of three problem: a graph decomposition approach. J Comput Biol 2009, 16(10):1369-1381.
  • [21]Bourque G, Pevzner PA: Genome-scale evolution: reconstructing gene orders in the ancestral species. Genome Res 2002, 12(1):26-36.
  • [22]Ma J, Ratan A, Raney BJ, Suh BB, Miller W, Haussler D: The infinite sites model of genome evolution. Proc Natl Acad Sci USA 2008, 105(38):14254-14261.
  • [23]El-Mabrouk N: Sorting signed permutations by reversals and insertions/deletions of contiguous segments. J Discrete Algorithm 2000, 1(1):105-121.
  • [24]Yancopoulos S, Friedberg R: DCJ path formulation for genome transformations which include insertions, deletions, and duplications. J Comput Biol 2009, 16(10):1311-1338.
  • [25]Braga MD, Willing E, Stoye J: Double cut and join with insertions and deletions. J Comput Biol 2011, 18(9):1167-1184.
  • [26]El-Mabrouk N, Sankoff D: Analysis of gene order evolution beyond single-copy genes. Methods Mol Biol 2012, 855:397-429.
  • [27]Chauve C, El-Mabrouk N, Gueguen L, Semeria M, Tannier E: Models and Algorithms for Genome Evolution. London: Springer-Verlag; 2013.
  • [28]Bader M: Genome rearrangements with duplications. BMC Bioinformatics 2010, 11(Suppl 1):S27-S27.
  • [29]Shao M, Lin Y: Approximating the edit distance for genomes with duplicate genes under DCJ, insertion and deletion. BMC Bioinformatics 2012, 13(Suppl 19):S13.
  • [30]Raphael B, Zhi D, Tang H, Pevzner P: A novel method for multiple alignment of sequences with repeated and shuffled elements. Genome Res 2004, 14(11):2336-2346.
  • [31]Paten B, Diekhans M, Earl D, John JS, Ma J, Suh B, Haussler D: Cactus graphs for genome comparisons. J Comput Biol 2011, 18(3):469-481.
  • [32]Medvedev P, Brudno M: Maximum likelihood genome assembly. J Comput Biol 2009, 16(8):1101-1116.
  • [33]Edmonds J, Johnson EL: Matching: A Well-Solved Class of Integer Linear Programs. 1970, 27-30.
  • [34]Tannier E, Zheng C, Sankoff D: Multichromosomal median and halving problems under different genomic distances. BMC Bioinformatics 2009, 10:120.
  • [35]Bienstock D, Langston MA: Algorithmic implications of the graph minor theorem. Handbooks in Operations Research and Management Science 1994, 7:481-502.
  文献评价指标  
  下载次数:157次 浏览次数:2次