BMC Genomics
Interlocus gene conversion explains at least 2.7 % of single nucleotide variants in human segmental duplications
Beth L. Dumont [1]
[1] Initiative in Biological Complexity, North Carolina State University, 112 Derieux Place, 3510 Thomas Hall, Raleigh, NC 27695-7614, USA
Keywords: Gene duplication; Segmental duplication; Recombination; 1000 Genomes; Global pairwise alignment; Polymorphism; Gene conversion
DOI: 10.1186/s12864-015-1681-3
Received 2015-02-24, accepted 2015-06-01, published 2015
【 Abstract 】
Background
Interlocus gene conversion (IGC) is a recombination-based mechanism that results in the unidirectional transfer of short stretches of sequence between paralogous loci. Although IGC is a well-established mechanism of human disease, the extent to which this mutagenic process has shaped overall patterns of segregating variation in multi-copy regions of the human genome remains unknown. One expected manifestation of IGC in population genomic data is the presence of one-to-one paralogous SNPs that segregate identical alleles.
Results
Here, I use SNP genotype calls from the low-coverage phase 3 release of the 1000 Genomes Project to identify 15,790 parallel, shared SNPs in duplicated regions of the human genome. My approach for identifying these sites accounts for the potential redundancy of short read mapping in multi-copy genomic regions, thereby effectively eliminating false positive SNP calls arising from paralogous sequence variation. I demonstrate that independent mutation events to identical nucleotides at paralogous sites are not a significant source of shared polymorphisms in the human genome, consistent with the interpretation that these sites are the outcome of historical IGC events. These putative signals of IGC are enriched in genomic contexts previously associated with non-allelic homologous recombination, including clear signals in gene families that form tandem intra-chromosomal clusters.
Conclusions
Taken together, my analyses implicate IGC, not point mutation, as the mechanism generating at least 2.7 % of single nucleotide variants in duplicated regions of the human genome.
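The notion of a shared paralogous SNP described in the Results can be illustrated with a minimal Python sketch. It is not the author's pipeline: the function name `shared_paralogous_snps`, the `snps` and `paralog_map` data structures, and the toy coordinates are all hypothetical. The sketch assumes that a one-to-one map between paralogous positions has already been derived (for example from a global pairwise alignment), that alleles are reported on a common strand, and that SNP calls have already been filtered for redundant read mapping.

```python
from typing import Dict, FrozenSet, List, Tuple

Site = Tuple[str, int]  # (chromosome, 1-based position); hypothetical layout


def shared_paralogous_snps(
    snps: Dict[Site, FrozenSet[str]],
    paralog_map: Dict[Site, Site],
) -> List[Tuple[Site, Site]]:
    """Return paralogous site pairs that are both polymorphic and
    segregate the same set of alleles."""
    shared = []
    seen = set()
    for site, partner in paralog_map.items():
        # Order the pair so that A->B and B->A are counted only once.
        pair = tuple(sorted((site, partner)))
        if pair in seen:
            continue
        seen.add(pair)
        alleles_a = snps.get(site)
        alleles_b = snps.get(partner)
        # Both sites must carry a SNP call with identical allele sets.
        if alleles_a is None or alleles_b is None:
            continue
        if len(alleles_a) >= 2 and alleles_a == alleles_b:
            shared.append(pair)
    return sorted(shared)


# Toy data (invented for illustration only).
snps = {
    ("chr1", 100): frozenset("AG"),
    ("chr1", 5100): frozenset("AG"),   # same alleles as its paralog
    ("chr1", 200): frozenset("CT"),
    ("chr1", 5200): frozenset("CG"),   # polymorphic, but different alleles
    ("chr1", 300): frozenset("AC"),    # paralogous site is monomorphic
}
paralog_map = {
    ("chr1", 100): ("chr1", 5100),
    ("chr1", 5100): ("chr1", 100),
    ("chr1", 200): ("chr1", 5200),
    ("chr1", 300): ("chr1", 5300),
}

result = shared_paralogous_snps(snps, paralog_map)
```

In this toy input only the pair at positions 100 and 5100 qualifies. A real analysis would also need to handle inverted duplications (complementing alleles) and to test whether recurrent mutation could produce the same pattern, as the abstract describes.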
【 License 】
2015 Dumont.