Journal article details
BMC Bioinformatics
Rapid forward-in-time simulation at the chromosome and genome level
Alexandros Stamatakis [1], Andre J Aberer [1]
[1] The Exelixis Lab, Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Schloss-Wolfsbrunnenweg 35, Heidelberg D-69118, Germany
Keywords: Natural selection; Software; Algorithm; Fisher-Wright model; Forward-in-time simulation; Population genetics
DOI: 10.1186/1471-2105-14-216
Received 2013-04-07, accepted 2013-07-03, published 2013
【 Abstract 】

Background

In population genetics, simulation is a fundamental tool for analyzing how basic evolutionary forces such as natural selection, recombination, and mutation shape the genetic landscape of a population. Forward simulation represents the most powerful, but, at the same time, most compute-intensive approach for simulating the genetic material of a population.
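
To make the cost of forward simulation concrete, the following is a minimal sketch of a haploid Wright-Fisher forward simulation with selection, recombination, and mutation. It is not the AnA-FiTS implementation; the function name `simulate`, all parameter names and defaults, the multiplicative fitness model, and the single-crossover recombination scheme are assumptions chosen for illustration only.

```python
import random


def simulate(pop_size=50, seq_len=100, generations=20,
             mut_rate=0.001, rec_rate=0.1, sel_coeff=0.05, seed=1):
    """Toy haploid Wright-Fisher forward simulation (illustrative only).

    Each individual is a frozenset of positions that carry a derived mutation.
    All names and defaults here are hypothetical, not taken from AnA-FiTS.
    """
    rng = random.Random(seed)
    pop = [frozenset() for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: assumed multiplicative fitness, (1 + s) per mutation.
        weights = [(1.0 + sel_coeff) ** len(ind) for ind in pop]
        next_pop = []
        for _ in range(pop_size):
            # Sample two parents with probability proportional to fitness.
            mother, father = rng.choices(pop, weights=weights, k=2)
            # Recombination: at most one crossover per offspring.
            if rng.random() < rec_rate:
                cut = rng.randrange(1, seq_len)
                child = ({p for p in mother if p < cut}
                         | {p for p in father if p >= cut})
            else:
                child = set(mother)
            # Mutation: every site is tested independently and flips state.
            for pos in range(seq_len):
                if rng.random() < mut_rate:
                    child ^= {pos}
            next_pop.append(frozenset(child))
        pop = next_pop
    return pop


if __name__ == "__main__":
    final = simulate()
    print(len(final), "individuals,",
          len(set().union(*final)), "segregating or fixed sites")
```

Every generation touches every individual and, in this naive form, every site, so the work grows with population size × sequence length × number of generations. This is the scaling that makes forward simulation compute-intensive at the chromosome level.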

Results

We introduce AnA-FiTS, a highly optimized forward simulation tool that is up to two orders of magnitude faster than current state-of-the-art software. In addition, we present a novel algorithm that improves runtimes by up to a further order of magnitude for simulations in which a fraction of the mutations is neutral (e.g., only 10% of mutations have an effect on fitness). Apart from simulated sequences, our tool also generates a graph structure that depicts the complete observable history of neutral mutations.

Conclusions

The substantial performance improvements allow for conducting forward simulations at the chromosome and genome level. The graph structure generated by our algorithm can give rise to novel approaches for visualizing and analyzing the output of forward simulations.

【 License 】

   
2013 Aberer and Stamatakis; licensee BioMed Central Ltd.
