BMC Genomics
Exome capture from saliva produces high quality genomic and metagenomic data
Brenna M Henn1,11  Carlos D Bustamante3  Jeffrey D Wall1  Katherine S Pollard4  Marcus W Feldman7  Eileen G Hoal1,10  Peter Parham9  Xiao Liu5  Yingrui Li5  Qiang Feng5  Xiaosen Guo5  Moraima Guadalupe2  Alexandra Adams3  Neda Nemat-Gorgani9  Christopher R Gignoux6  Martin Sikora3  Meredith L Carpenter3  Alicia R Martin3  Paul J Norman9  Dean Bobo1,11  Thomas J Sharpton1,12  Jeffrey M Kidd8
[1] Institute for Human Genetics, and the Departments of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94143, USA
[2] Agilent Technologies, Genomics Division, Cedar Creek, TX 78612, USA
[3] Department of Genetics, Stanford University, Stanford, CA 94305, USA
[4] The J. David Gladstone Institutes, University of California, San Francisco, San Francisco, CA 94158, USA
[5] Translational Medicine, BGI – Shenzhen, Shenzhen, China
[6] Program in Pharmaceutical Sciences and Pharmacogenomics, University of California, San Francisco, CA 94143, USA
[7] Department of Biological Sciences, Stanford University, Stanford, CA 94305, USA
[8] Departments of Human Genetics, and Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
[9] Department of Structural Biology, Stanford University, Stanford, CA 94305, USA
[10] Stellenbosch University, Tygerberg, South Africa
[11] Department of Ecology and Evolution, Stony Brook University, Life Sciences Bldg, Room 640, Stony Brook, NY 11794, USA
[12] Departments of Microbiology, and Statistics, Oregon State University, Corvallis, OR 97331, USA
Keywords: Microbiome; Metagenomics; Genetic diversity; KhoeSan; Exomes
Others: 1217535; DOI: 10.1186/1471-2164-15-262
Received 2014-03-18, accepted 2014-03-28, publication year 2014
【 Abstract 】
Background
Targeted capture of genomic regions reduces sequencing cost while generating higher coverage by allowing biomedical researchers to focus on specific loci of interest, such as exons. Targeted capture also has the potential to facilitate the generation of genomic data from DNA collected via saliva or buccal cells. DNA samples derived from these cell types tend to have a lower human DNA yield, may be degraded with age, and/or may be contaminated by bacteria or other ambient oral microbiota. However, thousands of samples have previously been collected from these cell types, and saliva collection has the advantage that it is non-invasive and appropriate for a wide variety of research.
Results
We demonstrate successful enrichment and sequencing of 15 South African KhoeSan exomes and 2 full genomes with samples initially derived from saliva. The expanded exome dataset enables us to characterize genetic diversity free from ascertainment bias for multiple KhoeSan populations, including new exome data from six HGDP Namibian San, revealing substantial population structure across the Kalahari Desert region. Additionally, we discover and independently verify thirty-one previously unknown KIR alleles using methods we developed to accurately map and call the highly polymorphic HLA and KIR loci from exome capture data. Finally, we show that exome capture of saliva-derived DNA yields sufficient non-human sequences to characterize oral microbial communities, including detection of bacteria linked to oral disease (e.g. Prevotella melaninogenica). For comparison, two samples were sequenced using standard full genome library preparation without exome capture and we found no systematic bias of metagenomic information between exome-captured and non-captured data.
Conclusions
DNA from human saliva samples, collected and extracted using standard procedures, can be used to successfully sequence high quality human exomes, and metagenomic data can be derived from non-human reads. We find that individuals from the Kalahari carry a higher oral pathogenic microbial load than samples surveyed in the Human Microbiome Project. Additionally, rare variants present in the exomes suggest strong population structure across different KhoeSan populations.
【 License 】
2014 Kidd et al.; licensee BioMed Central Ltd.