BMC Genomics
Flexible and scalable genotyping-by-sequencing strategies for population studies
Stephen L Dellaporta [2], Joe Tohme [1], Hongyu Zhao [5], John P Mottinger [3], John D Overton [4], Maria A Moreno [2], Christopher A Fragoso [5], Christopher Heffelfinger [2]
[1] Agrobiodiversity Research Area, Centro Internacional de Agricultura Tropical (CIAT), A.A. 6713, Cali, Colombia; [2] Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT 06511, USA; [3] Department of Cell and Molecular Biology, University of Rhode Island, Kingston, RI 02881, USA; [4] Current: Regeneron Genetics Center, Regeneron, Tarrytown, NY 10591, USA; [5] Department of Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520-8034, USA
Keywords: Agricultural genomics; Plant breeding; Trait mapping; Population genomics; Reduced representation sequencing; GBS; Genotyping
Others: 1092451; DOI: 10.1186/1471-2164-15-979
Received 2014-05-28, accepted 2014-10-23, published 2014
【 Abstract 】
Background
Many areas critical to agricultural production and research, such as breeding and trait mapping in plants and livestock, require robust and scalable genotyping platforms. Genotyping-by-sequencing (GBS) is one such method and is highly suited to non-human organisms. In the GBS protocol, genomic DNA is fractionated via restriction digest, and reduced representation is then achieved through size selection. Since many restriction sites are conserved across a species, the sequenced portion of the genome is highly consistent within a population. This makes the GBS protocol highly suited to experiments that require surveying large numbers of markers within a population, such as those involving genetic mapping, breeding, and population genomics. We have modified the GBS technology in a number of ways. Custom, enzyme-specific adaptors have been replaced with standard Illumina adaptors compatible with blunt-end restriction enzymes. Multiplexing is achieved through a dual barcoding system, and bead-based library preparation protocols allow for in-solution size selection and eliminate the need for columns and gels.
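The digest-then-size-select idea described above can be illustrated with a minimal in silico sketch. This is not the authors' pipeline: the function names, the toy sequence, and the size window are hypothetical, and AluI (AG^CT) is used only as a familiar blunt-end cutter, not as a claim about the enzymes tested in the paper.

```python
import re


def digest(sequence, motif, cut_offset):
    """Cut `sequence` at every occurrence of `motif`.

    `cut_offset` is the position inside the motif where a blunt-end
    enzyme cuts (e.g. 2 for AG^CT). A lookahead is used so that
    overlapping motif occurrences are all found.
    """
    pattern = re.compile(f"(?={re.escape(motif)})")
    cuts = [m.start() + cut_offset for m in pattern.finditer(sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_len, max_len):
    """Keep only fragments whose length falls inside the size window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy example (hypothetical sequence and window, for illustration only).
toy_sequence = "TTAGCTGGGGAGCTA"
fragments = digest(toy_sequence, "AGCT", cut_offset=2)
selected = size_select(fragments, min_len=5, max_len=10)
print(fragments)  # ['TTAG', 'CTGGGGAG', 'CTA']
print(selected)   # ['CTGGGGAG']
```

Because the same motif positions are cut in every individual that shares them, the fragments surviving the size window tend to come from the same loci across a population, which is the property the protocol relies on.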
Results
A panel of eight restriction enzymes was selected for testing on B73 maize and Nipponbare rice genomic DNA. Quality of the data was demonstrated by showing that the vast majority of reads from each enzyme aligned to restriction sites predicted in silico. The link between enzyme parameters and experimental outcome was demonstrated by showing that the sequenced portion of the genome was adaptable by selecting enzymes based on motif length, complexity, and methylation sensitivity. The utility of the new GBS protocol was demonstrated by correctly mapping several traits in a maize F2 population resulting from a B73 × Country Gentleman test cross.
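As a rough illustration of why motif length and complexity change the sequenced portion of the genome, the sketch below estimates how often a recognition motif is expected to occur. It assumes a random sequence with equal base frequencies, which real genomes violate, and it does not model methylation sensitivity. The function names and the genome length are hypothetical; degenerate positions use standard IUPAC codes.

```python
# Number of bases each IUPAC code can match.
IUPAC_OPTIONS = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def site_probability(motif):
    """Probability that a random position starts a match to `motif`,
    assuming independent bases at 25% each (a simplifying assumption)."""
    probability = 1.0
    for base in motif.upper():
        probability *= IUPAC_OPTIONS[base] / 4
    return probability


def expected_sites(genome_length, motif):
    """Approximate expected number of sites; edge effects are ignored."""
    return genome_length * site_probability(motif)


# Hypothetical 1 Mb sequence: longer motifs cut less often, and
# degenerate positions (Y, R) raise the frequency of a 6 bp motif.
for motif in ("AGCT", "GTYRAC", "GTTAAC"):
    print(motif, expected_sites(1_000_000, motif))
```

Under these assumptions a 4 bp motif is expected once every 256 bp and a fully specified 6 bp motif once every 4,096 bp, so the choice of enzyme sets the approximate number of fragments available for size selection.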
Conclusions
This technology is readily adaptable to different genomes, highly amenable to multiplexing, and compatible with over forty commercially available restriction enzymes. These advancements represent a major improvement in genotyping technology by providing a highly flexible and scalable GBS protocol that is readily implemented for studies on genome-wide variation.
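The multiplexing capacity of a dual barcoding system can be sketched as follows: each sample is identified by a pair of barcodes, so the number of distinguishable samples is the product of the two barcode set sizes. The barcode sequences, sample names, and function names here are hypothetical and are not taken from the paper.

```python
from itertools import product


def build_sample_sheet(first_barcodes, second_barcodes, sample_names):
    """Assign each sample a unique (first, second) barcode pair."""
    pairs = list(product(first_barcodes, second_barcodes))
    if len(sample_names) > len(pairs):
        raise ValueError("not enough barcode combinations for all samples")
    return dict(zip(pairs, sample_names))


def assign_read(barcode_pair, sample_sheet):
    """Return the sample for a read's barcode pair, or None if unknown."""
    return sample_sheet.get(barcode_pair)


# Hypothetical barcodes: 2 x 3 = 6 combinations from only 5 oligos.
first = ["ACGT", "TGCA"]
second = ["AAGG", "CCTT", "GGAA"]
samples = [f"plant_{i}" for i in range(1, 7)]

sheet = build_sample_sheet(first, second, samples)
print(len(sheet))                           # 6
print(assign_read(("TGCA", "AAGG"), sheet)) # plant_4
print(assign_read(("TTTT", "AAGG"), sheet)) # None
```

The design point is that capacity grows multiplicatively while the number of barcode oligos grows additively, which is what makes pairwise indexing attractive for large populations.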
【 License 】
2014 Heffelfinger et al.; licensee BioMed Central Ltd.