BMC Genomics
Finished sequence and assembly of the DUF1220-rich 1q21 region using a haploid human genome
Majesta O’Bleness [3], Veronica B Searles [3], C Michael Dickens [3], David Astling [3], Derek Albracht [1], Angel C Y Mak [2], Yvonne Y Y Lai [2], Chin Lin [2], Catherine Chu [2], Tina Graves [1], Pui-Yan Kwok [2], Richard K Wilson [1], James M Sikela [3]
[1] The Genome Institute at Washington University School of Medicine, St. Louis, MO 63108, USA
[2] Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
[3] Department of Biochemistry and Molecular Genetics, Human Medical Genetics and Neuroscience Programs, University of Colorado School of Medicine, 12801 E. 17th Avenue, Aurora, CO 80045, USA
Keywords: Hydatidiform mole; DUF1220 domain; 1q21
DOI: 10.1186/1471-2164-15-387
Received 2013-12-18; accepted 2014-05-06; published 2014
【 Abstract 】
Background
Although the reference human genome sequence was declared finished in 2003, some regions of the genome remain incomplete due to their complex architecture. One such region, 1q21.1-q21.2, is of increasing interest due to its relevance to human disease and evolution. Elucidation of the exact variants behind these associations has been hampered by the repetitive nature of the region and its incomplete assembly. This region also contains 238 of the 270 human DUF1220 protein domains, which are implicated in human brain evolution and neurodevelopment. Additionally, examinations of this protein domain have been challenging due to the incomplete 1q21 build. To address these problems, a single-haplotype hydatidiform mole BAC library (CHORI-17) was used to produce the first complete sequence of the 1q21.1-q21.2 region.
Results
We found and addressed several inaccuracies in the GRCh37 sequence of the 1q21 region on large and small scales, including genomic rearrangements and inversions, and incorrect gene copy number estimates and assemblies. The DUF1220-encoding NBPF genes required the most corrections, with 3 genes removed, 2 genes reassigned to the 1p11.2 region, 8 genes requiring assembly corrections for DUF1220 domains (~91 DUF1220 domains were misassigned), and multiple instances of nucleotide changes that reassigned the domain to a different DUF1220 subtype. These corrections resulted in an overall increase in DUF1220 copy number, yielding a haploid total of 289 copies. Approximately 20 of these new DUF1220 copies were the result of a segmental duplication from 1q21.2 to 1p11.2 that included two NBPF genes. Interestingly, this duplication may have been the catalyst for the evolutionarily important human lineage-specific chromosome 1 pericentric inversion.
Conclusions
Through the hydatidiform mole genome sequencing effort, the 1q21.1-q21.2 region is now complete, and misassemblies involving inter- and intra-region duplications have been resolved. The availability of this single haploid sequence path will aid in the investigation of many genetic diseases linked to 1q21, including several associated with DUF1220 copy number variations. Finally, the corrected sequence identified a recent segmental duplication that added approximately 20 DUF1220 copies to the human genome and may have facilitated the chromosome 1 pericentric inversion that is among the most notable human-specific genomic landmarks.
【 License 】
2014 O’Bleness et al.; licensee BioMed Central Ltd.