BMC Genomics
De novo prediction of cis-regulatory elements and modules through integrative analysis of a large number of ChIP datasets
Zhengchang Su [1], Ehsan S Tabari [1], Meng Niu [1]
[1] Department of Bioinformatics and Genomics, College of Computing and Informatics, The University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA
Keywords: Drosophila melanogaster; ChIP-seq; ChIP-chip; cis-regulatory modules; cis-regulatory elements
DOI: 10.1186/1471-2164-15-1047
Received 2014-06-26; accepted 2014-11-19; published 2014
【 Abstract 】
Background
In eukaryotes, transcriptional regulation is usually mediated by interactions of multiple transcription factors (TFs) with their respective specific cis-regulatory elements (CREs) in so-called cis-regulatory modules (CRMs) in DNA. Although knowledge of the CREs and CRMs in a genome is crucial for elucidating gene regulatory networks and understanding many important biological phenomena, little is known about the CREs and CRMs in most eukaryotic genomes because they are difficult to characterize by either computational or traditional experimental methods. However, the exponentially increasing volume of TF binding location data produced by the recent wide adoption of chromatin immunoprecipitation coupled with microarray hybridization (ChIP-chip) or high-throughput sequencing (ChIP-seq) has provided an unprecedented opportunity to identify CRMs and CREs in genomes. Nonetheless, effectively mining these large volumes of ChIP data to identify CREs and CRMs at nucleotide resolution remains highly challenging.
Results
We have developed a novel graph-theoretic algorithm, DePCRM, for genome-wide de novo prediction of CREs and CRMs using a large number of ChIP datasets. DePCRM predicts CREs and CRMs by identifying overrepresented combinatorial CRE motif patterns across multiple ChIP datasets. When applied to 168 ChIP datasets for 56 TFs from D. melanogaster, DePCRM identified 184 overrepresented CRE motifs and 746 overrepresented combinatorial patterns of these motifs, and predicted a total of 115,932 CRMs in the genome. The predictions recover 77.9% of known CRMs in the datasets and 89.3% of known CRMs containing at least one predicted CRE. We found that the putative CRMs, as well as the CREs within a CRM taken as a whole, are more conserved than randomly selected sequences.
Conclusion
Our results suggest that the CRMs predicted by DePCRM are highly likely to be functional. Our algorithm is the first of its kind for de novo genome-wide prediction of CREs and CRMs using a large number of transcription factor ChIP datasets. We hope that the algorithm and predictions will facilitate the elucidation of gene regulatory networks in eukaryotes. All the predicted CREs, CRMs, and their target genes are available at http://bioinfo.uncc.edu/mniu/pcrms/www/.
【 License 】
2014 Niu et al.; licensee BioMed Central Ltd.