BMC Genomics
Evaluating the accuracy of AIM panels at quantifying genome ancestry
Antonio Salas [2], Federico Martinón-Torres [1], Jacobo Pardo-Seco [2]
[1] Unidad de Emergencias Pediátrica y Cuidados Intensivos, Departamento de Pediatría, Hospital Clínico Universitario de Santiago, Santiago de Compostela, Galicia, Spain; Grupo de Investigación en Genética, Vacunas, Infecciones y Pediatría (GENVIP), Hospital Clínico Universitario and Universidade de Santiago de Compostela (USC), Santiago de Compostela, Galicia, Spain
Keywords: Ancestry; AIMs; SNPs; Genomics
DOI: 10.1186/1471-2164-15-543
Received 2014-03-10; accepted 2014-06-19; published 2014
【 Abstract 】
Background
There is a growing interest among geneticists in developing panels of Ancestry Informative Markers (AIMs) aimed at measuring the biogeographical ancestry of individual genomes. The efficiency of these panels is commonly tested empirically by contrasting self-reported ancestry with the ancestry estimated from these panels.
Results
Using SNP data from HapMap, we carried out a simulation-based study to measure the effect of SNP coverage on the estimation of genome ancestry. For three of the main continental groups (Africans, East Asians, Europeans), ancestry was first estimated using the whole HapMap SNP database as a proxy for global genome ancestry; these estimates were then compared to those obtained from pre-designed AIM panels. Panels with more than 400 AIMs capture genome ancestry reasonably well, while those containing a few dozen AIMs show large variability in ancestry estimates. Curiously, 500 to 1,000 SNPs selected at random from the genome provide an unbiased estimate of genome ancestry and perform as well as any AIM panel of similar size. In simulated scenarios of population admixture, panels containing few AIMs also measure genome ancestry poorly.
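The dependence on panel size can be illustrated with a toy simulation. The sketch below is not the authors' pipeline and does not use HapMap data: it assumes a hypothetical two-population model with independent SNPs, made-up allele frequencies drawn uniformly between 0.05 and 0.95, and a simple grid-search maximum-likelihood estimator. All function names and parameter values are illustrative assumptions. It only shows how the spread of ancestry estimates from randomly chosen panels can be compared across panel sizes.

```python
import math
import random
import statistics


def simulate_frequencies(n_snps, rng):
    """Draw hypothetical allele frequencies (pop A, pop B) for each SNP."""
    return [(rng.uniform(0.05, 0.95), rng.uniform(0.05, 0.95))
            for _ in range(n_snps)]


def simulate_genotypes(freqs, ancestry, rng):
    """Simulate allele counts (0, 1 or 2) for one admixed individual.

    `ancestry` is the assumed true fraction of the genome from population A.
    """
    genotypes = []
    for pa, pb in freqs:
        p = ancestry * pa + (1 - ancestry) * pb
        genotypes.append(sum(rng.random() < p for _ in range(2)))
    return genotypes


def estimate_ancestry(freqs, genotypes, grid=51):
    """Grid-search maximum-likelihood estimate of the population A fraction."""
    best_q, best_ll = 0.0, -math.inf
    for i in range(grid):
        q = i / (grid - 1)
        ll = 0.0
        for (pa, pb), g in zip(freqs, genotypes):
            p = q * pa + (1 - q) * pb
            ll += g * math.log(p) + (2 - g) * math.log(1 - p)
        if ll > best_ll:
            best_q, best_ll = q, ll
    return best_q


def panel_spread(panel_size, n_genome=3000, true_q=0.5, replicates=30, seed=1):
    """Mean and standard deviation of estimates over random SNP panels.

    One simulated genome is fixed; each replicate draws a random panel of
    `panel_size` SNPs from it and re-estimates ancestry from that panel only.
    """
    rng = random.Random(seed)
    freqs = simulate_frequencies(n_genome, rng)
    genotypes = simulate_genotypes(freqs, true_q, rng)
    estimates = []
    for _ in range(replicates):
        idx = rng.sample(range(n_genome), panel_size)
        estimates.append(estimate_ancestry([freqs[i] for i in idx],
                                           [genotypes[i] for i in idx]))
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    for size in (30, 500):
        mean_q, sd_q = panel_spread(size)
        print(f"{size} SNPs: mean estimate {mean_q:.2f}, spread {sd_q:.3f}")
```

In this toy setting, panels of a few dozen SNPs should give visibly more scattered estimates than panels of several hundred, which is the qualitative pattern the Results describe. Real panels are selected for informativeness and real SNPs are linked, so the numbers produced here carry no quantitative meaning.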
Conclusions
The results indicate that the ability to estimate genome ancestry depends strongly on the number of AIMs used, and not primarily on their individual informativeness. Caution is warranted when making individual (medical, forensic, or anthropological) inferences based on AIMs.
【 License 】
2014 Pardo-Seco et al.; licensee BioMed Central Ltd.