BMC Evolutionary Biology
Declining transition/transversion ratios through time reveal limitations to the accuracy of nucleotide substitution models
Edward C Holmes [2], Simon YW Ho [1], Sebastián Duchêne [1]
[1] School of Biological Sciences, The University of Sydney, Sydney 2006, NSW, Australia; Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, Sydney Medical School, The University of Sydney, Sydney 2006, NSW, Australia
Keywords: Saturation; Substitution rate; Virus; Substitution model; Transition/transversion ratio
Others: 1158291; DOI: 10.1186/s12862-015-0312-6
Received 2014-11-09; accepted 2015-02-19; published 2015
【 Abstract 】
Background
Genetic analyses of DNA sequences make use of an increasingly complex set of nucleotide substitution models to estimate the divergence between gene sequences. However, there is currently no way to assess the validity of nucleotide substitution models over short time-scales and with limited mutational accumulation.
Results
We show that quantifying the decline in the ratio of transitions to transversions (ti/tv) over time provides an in-built measure of mutational saturation and hence of substitution model accuracy. We tested this through detailed phylogenetic analyses of 10 representative virus data sets comprising recently sampled and closely related sequences. In the majority of cases our estimates of ti/tv decrease with time, even under sophisticated time-reversible models of nucleotide substitution. This indicates that high levels of saturation are attained extremely rapidly in viruses, sometimes within decades. In contrast, we did not find any temporal patterns in selection pressures or CG-content over these short time-frames. To validate the temporal trend of ti/tv across a broader taxonomic range, we analyzed a set of 76 different viruses. Again, the estimate of ti/tv scaled negatively with evolutionary time, a trend that was more pronounced for rapidly evolving RNA viruses than for slowly evolving DNA viruses.
Conclusions
Our study shows that commonly used substitution models can underestimate the number of substitutions among closely related sequences, such that the time-scale of viral evolution and emergence may be systematically underestimated. In turn, estimates of ti/tv provide an effective internal control of substitution model performance in viruses because of their high sensitivity to mutational saturation.
【 License 】
2015 Duchêne et al.; licensee BioMed Central.