Algorithms for Molecular Biology | |
On the group theoretical background of assigning stepwise mutations onto phylogenies | |
Mareike Fischer2  Steffen Klaere1  Minh Anh Thi Nguyen2  Arndt von Haeseler2  | |
[1] Department of Statistics and School of Biological Sciences, University of Auckland, Private Bag 92019, Auckland, New Zealand | |
[2] , Center for Integrative Bioinformatics ViennaMax F. Perutz Laboratories, University of Vienna, Medical University of Vienna, University of Veterinary Medicine Vienna, Dr. Bohr Gasse 9, A-1030, Vienna, Austria | |
关键词: Group theory; Tree reconstruction; Substitution model; Maximum parsimony; Maximum likelihood; | |
Others : 794807 DOI : 10.1186/1748-7188-7-36 |
|
received in 2011-10-17, accepted in 2012-12-10, 发布年份 2012 | |
【 摘 要 】
Recently one step mutation matrices were introduced to model the impact of substitutions on arbitrary branches of a phylogenetic tree on an alignment site. This concept works nicely for the four-state nucleotide alphabet and provides an efficient procedure conjectured to compute the minimal number of substitutions needed to transform one alignment site into another. The present paper delivers a proof of the validity of this algorithm. Moreover, we provide several mathematical insights into the generalization of the OSM matrix to multi-state alphabets. The construction of the OSM matrix is only possible if the matrices representing the substitution types acting on the character states and the identity matrix form a commutative group with respect to matrix multiplication. We illustrate this approach by looking at Abelian groups over twenty states and critically discuss their biological usefulness when investigating amino acids.
【 授权许可】
2012 Fischer et al.; licensee BioMed Central Ltd.
【 预 览 】
Files | Size | Format | View |
---|---|---|---|
20140705073137423.pdf | 1411KB | download | |
Figure 4. | 83KB | Image | download |
Figure 4. | 83KB | Image | download |
Figure 4. | 83KB | Image | download |
Figure 4. | 83KB | Image | download |
Figure 4. | 83KB | Image | download |
Figure 3. | 31KB | Image | download |
Figure 3. | 31KB | Image | download |
Figure 3. | 31KB | Image | download |
Figure 3. | 31KB | Image | download |
Figure 2. | 31KB | Image | download |
Figure 1. | 197KB | Image | download |
Figure 1. | 197KB | Image | download |
Figure 1. | 197KB | Image | download |
【 图 表 】
Figure 1.
Figure 1.
Figure 1.
Figure 2.
Figure 3.
Figure 3.
Figure 3.
Figure 3.
Figure 4.
Figure 4.
Figure 4.
Figure 4.
Figure 4.
【 参考文献 】
- [1]Durbin R, Eddy SR, Krogh A, Mitchison G: Biological sequence analysis - Probabilistic models of proteins and nucleic acids. Cambridge: Cambridge University Press; 1998.
- [2]Mount DW: Bioinformatics: Sequence and Genome Analysis. New York: Cold Spring Harbor; 2004.
- [3]Semple C, Steel M: Phylogenetics. New York: Oxford University Press; 2003.
- [4]Nguyen MAT, Klaere S, von Haeseler A: MISFITS: evaluating the goodness of fit between a phylogenetic model and an alignment. Mol Biol Evol 2011, 28:143-152.
- [5]Klaere S, Gesell T, von Haeseler A: The impact of single substitutions on multiple sequence alignments. Philos T R Soc B 2008, 363:4041-4047.
- [6]Kimura M: Estimation of Evolutionary Distances between Homologous Nucleotide Sequences. P Natl Acad Sci USA 1981, 78:454-458.
- [7]Fitch WM: Toward defining the course of evolution: Minimum change for a specific tree topology. Syst Zool 1971, 20:406-416.
- [8]Humphreys JF: A course in group theory. New York: Oxford University Press; 1996.
- [9]Hendy M, Penny D, Steel M: A discrete Fourier analysis for evolutionary trees. P Natl Acad Sci USA 1994, 91:3339-3343.
- [10]Bashford JD, Jarvis PD, Sumner JG, Steel MA: U(1)×U(1)×U(1) symmetry of the Kimura 3ST model and phylogenetic branching processes. J Phys A: Math Gen 2004, 37:L81—L89.
- [11]Bryant D: Hadamard Phylogenetic Methods and the n-taxon process. Bull Math Biol 2009, 71(2):339-351.
- [12]Hendy MD, Penny D: A framework for the quantitative study of evolutionary trees. Syst Zool 1989, 38(4):297-309.
- [13]MacLane S, Birkhoff G: Algebra. Chelsea: American Mathematical Society; 1999.
- [14]Bryant D: Extending Tree Models to Split Networks. In Algebraic Statistics for Computational Biology. Edited by Pachter L, Sturmfels B. Cambridge: Cambridge University Press; 2005.
- [15]Kimura M: A Simple Method for Estimating Evolutionary Rates of Base Substitutions through Comparative Studies of Nucleotide Sequences. J Mol Evol 1980, 16:111-120.
- [16]Horn RA, Johnson CR: Topics in matrix analysis. New York: Oxford University Press; 1991.
- [17]Steeb WH, Hardy Y: Matrix Calculus and Kronecker Product: A Practical Approach to Linear and Multilinear Algebra. Singapore: World Scientific Publishing; 2011.
- [18]Cayley A: Desiderata and Suggestions: No. 2. The Theory of Groups: Graphical Representation. Am J Math 1878, 1(2):174-176.
- [19]Nguyen MAT, Gesell T, von Haeseler A: ImOSM: Intermittent Evolution and Robustness of Phylogenetic Methods. Mol Biol Evol 2012, 29(2):663-673.
- [20]Horn RA, Johnson CR: Matrix analysis. Cambridge: Cambridge University Press; 1990.
- [21]Tamura K, Nei M: Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 1993, 10(3):512-526.
- [22]Sumner JG, Holland BR, Jarvis PD: The Algebra of the General Markov Model on Phylogenetic Trees and Networks. Bull Math Biol 2012, 74(4):858-880.
- [23]Sumner JG, Jarvis PD, Fernandez-Sanchez J, Kaine B, Woodhams M, Holland BR: Is the general time-reversible model bad for molecular phylogenetics? Syst Biol 2012, 61(6):1069-1074.
- [24]Keilen T: Endliche Gruppen. Eine Einführung mit dem Ziel der Klassifikation von Gruppen kleiner Ordnung. 2000. [http://www.mathematik.uni-kl.de/wwwagag/download/scripts/Endliche.Gruppen.pdf webcite]
- [25]Kosiol C, Goldman N: Different Versions of the Dayhoff Rate Matrix. Mol Biol Evol 2005, 22(2):193-199.
- [26]Susko E, Roger AJ: On Reduced Amino Acid Alphabets for Phylogenetic Inference. Mol Biol Evol 2007, 24(9):2139-2150.