BMC Medical Genomics
On the identification of potential regulatory variants within genome wide association candidate SNP sets
Wyeth W Wasserman1  Chao A Hsiung2  I-Shou Chang4  Chih-yu Chen3
[1] Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
[2] Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Taiwan
[3] Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia, Canada
[4] National Institute of Cancer Research, National Health Research Institutes, Zhunan, Taiwan
Keywords: Topological domains; Enhancer; Transcription factor binding site alteration; Gene regulation; Regulatory regions; Lung cancer; GWAS
Others: 796977; DOI: 10.1186/1755-8794-7-34
Received 2014-01-22; accepted 2014-06-02; published 2014
【 Abstract 】
Background
Genome wide association studies (GWAS) are a population-scale approach to the identification of segments of the genome in which genetic variations may contribute to disease risk. Current methods focus on the discovery of single nucleotide polymorphisms (SNPs) associated with disease traits. As there are many SNPs within identified risk loci, and the majority of these are situated within non-coding regions, a key challenge is to identify and prioritize variants affecting regulatory sequences that are likely to contribute to the phenotype assessed.
Methods
We investigated SNPs within lung and breast cancer GWAS loci that reached genome-wide significance, assessing their potential roles in gene regulation with a specific focus on SNPs likely to disrupt transcription factor binding sites. Within risk loci, the regulatory potential of sub-regions was classified using relevant open chromatin and epigenetic high throughput sequencing data sets from the ENCODE project in available cancer and normal cell lines. Furthermore, variants altering transcription factor binding affinity were predicted by comparing position weight matrix scores between disease and reference alleles. Lastly, ChIP-seq data for transcription associated factors were included as binding evidence, and topological domains were used to infer potential target genes.
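The allele comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the count matrix, flanking sequences, pseudocount and uniform background are hypothetical values chosen for illustration, and only the forward strand is scanned. The sketch converts counts to a log-odds position weight matrix, scores every motif-length window that covers the SNP for each allele, and reports the difference between the best scores.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, pseudocount=0.8, background=0.25):
    """Convert per-position base counts into a log2-odds matrix.

    The pseudocount and the uniform background are illustrative assumptions.
    """
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount * background) / total) / background)
            for b in BASES
        })
    return pwm

def score_window(pwm, window):
    """Sum the per-position log-odds for one motif-length window."""
    return sum(col[base] for col, base in zip(pwm, window))

def best_score_over_snp(pwm, flank5, allele, flank3):
    """Best forward-strand score among windows that cover the SNP position."""
    k = len(pwm) - 1
    left = flank5[max(0, len(flank5) - k):]   # at most k bases upstream
    right = flank3[:k]                        # at most k bases downstream
    seq = left + allele + right
    snp = len(left)                           # index of the SNP within seq
    starts = range(max(0, snp - k), min(snp, len(seq) - len(pwm)) + 1)
    return max(
        (score_window(pwm, seq[s:s + len(pwm)]) for s in starts),
        default=float("-inf"),
    )

# Hypothetical 4-bp motif with consensus TGAC (20 made-up sites per column).
toy_counts = [
    {"A": 1, "C": 1, "G": 1, "T": 17},
    {"A": 1, "C": 1, "G": 17, "T": 1},
    {"A": 17, "C": 1, "G": 1, "T": 1},
    {"A": 1, "C": 17, "G": 1, "T": 1},
]
pwm = log_odds_pwm(toy_counts)

# Hypothetical locus: reference allele G, alternative allele A.
ref_score = best_score_over_snp(pwm, "CCT", "G", "ACG")
alt_score = best_score_over_snp(pwm, "CCT", "A", "ACG")
delta = ref_score - alt_score
print(f"ref={ref_score:.2f} alt={alt_score:.2f} delta={delta:.2f}")
```

A positive `delta` here means the reference allele matches the toy motif better than the alternative allele. A real analysis would also scan the reverse strand and use curated binding profiles rather than a made-up matrix.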
Results
The sets of SNPs, including both the disease-associated markers and those in high linkage disequilibrium with them, were significantly over-represented in regulatory sequences of cancer and/or normal cells; however, over-representation was generally not restricted to disease-relevant tissue specific regions. The calculated regulatory potential, allelic binding affinity scores and ChIP-seq binding evidence were the three criteria used to prioritize candidates. We highlighted the candidates that fit all three criteria: breast cancer susceptibility SNPs and a borderline lung cancer relevant SNP, located in cancer-specific enhancers overlapping multiple distinct transcription associated factor ChIP-seq binding sites.
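The three-criteria prioritization can be sketched as a simple filter. This is an illustration only: the record fields, the thresholds (`min_abs_delta`, `min_factors`), the ranking rule and all SNP and factor names below are hypothetical and do not come from the study.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CandidateSNP:
    snp_id: str                 # placeholder identifier, not a real rsID
    in_regulatory_region: bool  # criterion 1: predicted regulatory potential
    affinity_delta: float       # criterion 2: allelic PWM score difference
    chipseq_factors: tuple      # criterion 3: factors with overlapping ChIP-seq peaks

def meets_all_criteria(snp, min_abs_delta=1.0, min_factors=1):
    """True when a SNP satisfies all three criteria (thresholds are assumed)."""
    return (
        snp.in_regulatory_region
        and abs(snp.affinity_delta) >= min_abs_delta
        and len(snp.chipseq_factors) >= min_factors
    )

# Made-up candidates for illustration.
candidates = [
    CandidateSNP("snp_A", True, 2.5, ("TF1", "TF2", "TF3")),
    CandidateSNP("snp_B", True, 0.2, ("TF1",)),    # affinity change too small
    CandidateSNP("snp_C", False, 3.0, ("TF2",)),   # outside regulatory regions
    CandidateSNP("snp_D", True, -1.8, ()),         # no ChIP-seq support
    CandidateSNP("snp_E", True, -1.4, ("TF4",)),
]

# Rank survivors by number of bound factors, then by size of the affinity change.
prioritized = sorted(
    (s for s in candidates if meets_all_criteria(s)),
    key=lambda s: (-len(s.chipseq_factors), -abs(s.affinity_delta)),
)
print([s.snp_id for s in prioritized])
```

Requiring all three lines of evidence keeps the candidate list short for follow-up experiments, at the cost of discarding variants supported by only one or two data types.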
Conclusion
Incorporating high throughput sequencing epigenetic and transcription factor data sets from both cancer and normal cells into cancer genetic studies reveals potential functional SNPs and informs subsequent characterization efforts.
【 License 】
2014 Chen et al.; licensee BioMed Central Ltd.