期刊论文详细信息
Algorithms for Molecular Biology
On the number of genomic pacemakers: a geometric approach
Sagi Snir1 
[1] Department of Evolutionary Biology, University of Haifa, Haifa 31905, Israel
关键词: Gap statistics;    Partition distance;    Deming regression;    Genome evolution pacemaker;    Molecular evolution;   
Others  :  1089003
DOI  :  10.1186/s13015-014-0026-0
 received in 2014-10-21, accepted in 2014-11-11,  发布年份 2014
PDF
【 摘 要 】

The universal pacemaker (UPM) model extends the classical molecular clock (MC) model, by allowing each gene, in addition to its individual intrinsic rate as in the MC, to accelerate or decelerate according to the universal pacemaker. Under UPM, the relative evolutionary rates of all genes remain nearly constant whereas the absolute rates can change arbitrarily. It was shown on several taxa groups spanning the entire tree of life that the UPM model describes the evolutionary process better than the MC model. In this work we provide a natural generalization to the UPM model that we denote multiple pacemakers (MPM). Under the MPM model every gene is still affected by a single pacemaker, however the number of pacemakers is not confined to one. Such a model induces a partition over the gene set where all the genes in one part are affected by the same pacemaker and task is to identify the pacemaker partition, or in other words, finding for each gene its associated pacemaker. We devise a novel heuristic procedure, relying on statistical and geometrical tools, to solve the problem and demonstrate by simulation that this approach can cope satisfactorily with considerable noise and realistic problem sizes. We applied this procedure to a set of over 2000 genes in 100 prokaryotes and demonstrated the significant existence of two pacemakers.

【 授权许可】

   
2014 Snir; licensee BioMed Central.

【 预 览 】
附件列表
Files Size Format View
20150123010114529.pdf 1038KB PDF download
Figure 5. 18KB Image download
Figure 4. 20KB Image download
Figure 3. 46KB Image download
Figure 2. 21KB Image download
Figure 1. 9KB Image download
【 图 表 】

Figure 1.

Figure 2.

Figure 3.

Figure 4.

Figure 5.

【 参考文献 】
  • [1]Snir S, Wolf YI, Koonin EV: Universal pacemaker of genome evolution. PLoS Comput Biol2012, 8:e1002785,11.
  • [2]Zuckerkandl E: On the molecular evolutionary clock. J Mol Evol 1987, 26(1):34-46.
  • [3]Bromham L, Penny D: The modern molecular clock. Nat Rev Genet 2003, 4(3):216-224. doi:10.1038/nrg1020
  • [4]Linder M, Britton T, Sennblad B: Evaluation of bayesian models of substitution rate evolution parental guidance versus mutual independence. Syst Biol 2011, 60(3):329-342.
  • [5]Allan Drummond D, Wilke CO: Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell 2008, 134(2):341-352.
  • [6]Grishin NV, Wolf YI, Koonin EV: From complete genomes to measures of substitution rate variability within and between proteins. Genome Res 2000, 10(7):991-1000. doi:10.1101/gr.10.7.991
  • [7]Wolf YI, Novichkov PS, Karev GP, Koonin EV, Lipman DJ: The universal distribution of evolutionary rates of genes and distinct characteristics of eukaryotic genes of different apparent ages. Proc Nat Acad Sci 2009, 106(18):7273-7280.
  • [8]Bromham L: Why do species vary in their rate of molecular evolution? Biol Lett 2009, 5(3):401-404.
  • [9]Snir S, Wolf YI, Koonin EV: Universal pacemaker of genome evolution in animals and fungi and variation of evolutionary rates in diverse organisms. Genome Biol Evol 2014, 6(6):1268-1278.
  • [10]Wolf YI, Snir S, Koonin EV: Stability along with extreme variability in core genome evolution. Genome Biol Evol 2013, 5(7):1393-1402.
  • [11]Ho SYW: The changing face of the molecular evolutionary clock. Trends in Ecol Evol 2014, 29(9):496-503.
  • [12]Duchêne S, Molak M, Ho SYW: Clockstar: choosing the number of relaxed-clock models in molecular phylogenetic analysis. Bioinformatics 2014, 30(7):1017-1019.
  • [13]Ho SYW, Lanfear R: Improved characterisation of among-lineage rate variation in cetacean mitogenomes using codon-partitioned relaxed clocks. Mitochondrial DNA 2010, 21:138-146.
  • [14]Lanfear R, Calcott B, Ho SYW, Guindon S: Partitionfinder: Combined selection of partitioning schemes and substitution models for phylogenetic analyses. Mol Biol Evol 2012, 29(6):1695-1701.
  • [15]Wasserman L: All of Statistics. Springer, New York; 2004.
  • [16]Adcock RJ: A problem in least squares. Ann Mathematics 1878, 5:53-54.
  • [17]Deming WE: Statistical Adjustment of Data: John Wiley & Sons; 1943.
  • [18]Fuller WA: Measurement Error Models: John Wiley & Sons; 1987.
  • [19]Moran S, Snir S: Efficient approximation of convex recolorings. J Comput Syst Sci 2007, 73(7):1078-1089.
  • [20]Moran S, Snir S: Convex recolorings of strings and trees: Definitions, hardness results and algorithms. J Comput Syst Sci 2008, 74(5):850-869.
  • [21]Gusfield D: Partition-distance: A problem and class of perfect graphs arising in clustering. Inf Process Lett 2002, 82(3):159-164.
  • [22]Tibshirani R, Walther G, Hastie T: Estimating the number of clusters in a data set via the gap statistic. J R Stat Soc: Ser B (Stat Methodology) 2001, 63(2):411-423.
  • [23]Hartigan JA, Wong MA: A k-means clustering algorithm. Appl Stat 1979, 28:100-108.
  • [24]Lloyd SP: Least squares quantization in pcm. IEEE Trans Inf Theory 1982, 28:129-137.
  • [25]Kruskal JB, Wish M: Multidimensional Scaling. Sage Publications, Beverly Hills and London; 1978.
  • [26]Kruskal JB: Nonmetric multidimensional scaling: a numerical method. Psychometrika 1964, 29:115-130.
  • [27]Guérin R, Orda A: Computing shortest paths for any number of hops. IEEE/ACM Trans Netw 2002, 10(5):613-620.
  • [28]Borg I, Groenen P: Modern Multidimensional Scaling, Theory and Applications. Springer - Verlag, New York; 1997.
  • [29]Lawler EL: Combinatorial Optimization: Networks and Matroids: The University of Michigan; 1976.
  • [30]Ahuja RK, Magnanti TL, Orlin JB: Network Flows. New Jersey, Prentice Hall: Englewood Cliffs; 1993.
  • [31]Tutte WT: Connectivity in Graphs. Mathematical expositions.University of Toronto Press; 1966.
  • [32]Preis R: Linear time 1/2-approximation algorithm for maximum weighted matching in general graphs. In STACS 99, 16th Annual Symposium on Theoretical Aspects of Computer Science, March 4-6, 1999, Proceedings, volume 1563 of Lecture Notes in Computer Science. Springer, Trier, Germany; 1999:259-269.
  • [33]Bar-Yehuda R: One for the price of two: a unified approach for approximating covering problems. Algorithmica 2000, 27:131-144.
  • [34]Mardia KV: Some properties of classical multidimensional scaling. Commun Stat - Theory Methods 1978, A7:123-341.
  • [35]Finden CR, Gordon AD: Obtaining common pruned trees. J Classification 1985, 2:225-276.
  • [36]Puigbo P, Wolf Y, Koonin E: Search for a ’tree of life’ in the thicket of the phylogenetic forest. J Biol2009, 8(6):59.
  • [37]Doolittle WF: Phylogenetic classification and the universal tree. Science 1999, 284(5423):2124-9.
  • [38]Gogarten JP, Doolittle WF, Lawrence JG: Prokaryotic evolution in light of gene transfer. Mol Biol Evol 2002, 19:2226-2238.
  • [39]Snir S: Pacemaker partition identification. In Algorithms in Bioinformatics - 14th International Workshop, WABI 2014, September 8–10, 2014. Proceedings, volume 8701 of Lecture Notes in Computer Science,2014: 281–295. Springer: Wroclaw, Poland; 2014.
  • [40]Bafna V, Berman P, Fujito T: A 2-approximation algorithm for the undirected feedback vertex set problem. SIAM J Discrete Math 1999, 12:289-297.
  • [41]Bar-Yehuda R, Even S: A local-ratio theorem for approximating the weighted vertex cover problem. Ann Discrete Math 1985, 25:27-46.
  文献评价指标  
  下载次数:0次 浏览次数:0次