BMC Genetics
Simultaneous SNP identification and assessment of allele-specific bias from ChIP-seq data
Vishwanath R Iyer [1], Anna Battenhouse [1], Amelia Weber Hall [1], Yunyun Ni [1]
[1] Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, Section of Molecular Genetics and Microbiology, University of Texas at Austin, Austin, TX 78712, USA
Keywords: Allele-specific; Genotyping; ChIP-seq; Transcription factors; SNPs
DOI  :  10.1186/1471-2156-13-46
Received 2012-05-24, accepted 2012-09-05, published 2012
【 Abstract 】

Background

Single nucleotide polymorphisms (SNPs) have been associated with many aspects of human development and disease, and many non-coding SNPs associated with disease risk are presumed to affect gene regulation. We have previously shown that SNPs within transcription factor binding sites can affect transcription factor binding in an allele-specific and heritable manner. However, such analysis has relied on prior whole-genome genotypes provided by large external projects such as HapMap and the 1000 Genomes Project. This requirement limits the study of allele-specific effects of SNPs in primary patient samples from diseases of interest, where complete genotypes are not readily available.

Results

In this study, we show that we are able to identify SNPs de novo and accurately from ChIP-seq data generated in the ENCODE Project. Our de novo identified SNPs from ChIP-seq data are highly concordant with published genotypes. Independent experimental verification of more than 100 sites estimates our false discovery rate at less than 5%. Analysis of transcription factor binding at de novo identified SNPs revealed widespread heritable allele-specific binding, confirming previous observations. SNPs identified from ChIP-seq datasets were significantly enriched for disease-associated variants, and we identified dozens of allele-specific binding events in non-coding regions that could distinguish between disease and normal haplotypes.

Conclusions

Our approach combines SNP discovery, genotyping and allele-specific analysis, but is selectively focused on functional regulatory elements occupied by transcription factors or epigenetic marks, and will therefore be valuable for identifying the functional regulatory consequences of non-coding SNPs in primary disease samples.
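The allele-specific analysis summarized above can be illustrated with a minimal sketch. It assumes that, at a heterozygous SNP, ChIP-seq reads carrying each allele have already been counted, and that bias is assessed with an exact two-sided binomial test against an expected 50:50 ratio. The abstract does not state which test or thresholds the authors used, so the function name `binomial_two_sided_p`, the null proportion `p_ref`, and the example counts are all hypothetical.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_ref: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one SNP.

    Assumption (not taken from the paper): under the null hypothesis each
    read carries the reference allele with probability p_ref.
    """
    n = ref_reads + alt_reads
    if n == 0:
        # No reads cover the site, so there is no evidence of bias.
        return 1.0
    # Probability of every possible reference-read count under the null.
    pmf = [comb(n, k) * p_ref**k * (1 - p_ref) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the outcomes that are as likely as, or less likely than, the observed one.
    # The small tolerance guards against floating-point ties.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical read counts at two heterozygous sites.
balanced = binomial_two_sided_p(5, 5)   # equal support for both alleles
skewed = binomial_two_sided_p(10, 0)    # all reads carry one allele
print(balanced, skewed)
```

In practice a p-value like this would be computed at every heterozygous site and then corrected for multiple testing; reads mapping preferentially to the reference allele can also inflate apparent bias, which this sketch does not address.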

【 License 】

© 2012 Ni et al.; licensee BioMed Central Ltd.
