Journal article details
BMC Genomics
Impacts of low coverage depths and post-mortem DNA damage on variant calling: a simulation study
David Lambert [1], Matthew Parks [1]
[1] Environmental Futures Research Institute, Griffith University, Nathan 4111, Queensland, Australia
Keywords: Coverage depth; Next-generation sequencing; Variant calling; Ancient DNA
Others  :  1109791
DOI  :  10.1186/s12864-015-1219-8
Received 2014-09-16; accepted 2015-01-02; published 2015
【 Abstract 】

Background

Massively parallel sequencing platforms, featuring high throughput and relatively short read lengths, are well suited to ancient DNA (aDNA) studies. Variant identification from short-read alignment could be hindered, however, by the low DNA concentrations common to historic samples, which constrain sequencing depths, and by post-mortem DNA damage patterns.

Results

We simulated pairs of sequences to act as reference and sample genomes at varied GC contents and divergence levels. Short-read sequence pools were generated from sample sequences and subjected to varying levels of “post-mortem” damage by adjusting levels of fragmentation and fragmentation biases, transition rates at sequence ends, and sequencing depths. Mapping of sample read pools to reference sequences revealed several trends, including decreased alignment success with increased read length and decreased variant recovery with increased divergence. Variants were generally called with high accuracy; however, identification of SNPs (single-nucleotide polymorphisms) was less accurate for high damage/low divergence samples. Modest increases in sequencing depth resulted in rapid gains in total variant recovery, but only limited improvements to recovery of heterozygous variants.
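
As a rough illustration of the kind of damage model described above, the following Python sketch draws short reads from a sample sequence at a target depth and applies end-biased transitions. It is not the authors' pipeline: the function name `simulate_damaged_reads`, the parameters `mean_len`, `end_rate` and `decay`, the Gaussian fragment lengths, and the geometric decay of damage away from read ends are all assumptions made for this example.

```python
import random

def simulate_damaged_reads(genome, depth, mean_len=60, end_rate=0.3,
                           decay=0.5, seed=0):
    """Draw short reads from `genome` and apply end-biased transitions.

    All parameter names and defaults are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    # Number of reads needed for roughly `depth`-fold coverage.
    n_reads = max(1, round(depth * len(genome) / mean_len))
    reads = []
    for _ in range(n_reads):
        # Fragment length varies around mean_len (a crude fragmentation model).
        length = min(len(genome), max(20, int(rng.gauss(mean_len, 10))))
        start = rng.randrange(len(genome) - length + 1)
        bases = list(genome[start:start + length])
        for i, base in enumerate(bases):
            # Damage probability decays geometrically away from each read end.
            p5 = end_rate * decay ** i
            p3 = end_rate * decay ** (length - 1 - i)
            if base == "C" and rng.random() < p5:
                bases[i] = "T"   # C>T transition near the 5' end
            elif base == "G" and rng.random() < p3:
                bases[i] = "A"   # G>A transition near the 3' end
        reads.append((start, "".join(bases)))
    return reads

# Example: a random 1 kb "sample genome" sequenced at about 5x depth.
genome = "".join(random.Random(1).choice("ACGT") for _ in range(1000))
reads = simulate_damaged_reads(genome, depth=5)
```

Each returned tuple holds the read's true start position and its damaged sequence, so mismatches against the source can be counted directly. Setting `end_rate=0.0` gives undamaged reads for comparison.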

Conclusions

This in silico study suggests aDNA-associated damage patterns minimally impact variant call accuracy and recovery from short-read alignment, while modest increases in sequencing depth can greatly improve variant recovery.

【 License 】

2015 Parks and Lambert; licensee BioMed Central.
