Journal Article Details
BMC Bioinformatics
G-NEST: a gene neighborhood scoring tool to identify co-conserved, co-expressed genes
Danielle G Lemay [2], William F Martin [3], Angie S Hinrichs [5], Monique Rijnkels [1], J Bruce German [3], Ian Korf [2], Katherine S Pollard [4]
[1] Department of Pediatrics, Children’s Nutrition Research Center, Baylor College of Medicine, 1100 Bates Street, Houston, TX, 77030, United States of America
[2] Genome Center, University of California Davis, 451 Health Science Dr, Davis, CA, 95616, United States of America
[3] Department of Food Science and Technology, University of California Davis, One Shields Avenue, Davis, CA, 95616, United States of America
[4] Gladstone Institutes, Division of Biostatistics and Institute for Human Genetics, University of California San Francisco, 1650 Owens St, San Francisco, CA, 94158, United States of America
[5] Center for Biomolecular Science and Engineering, University of California Santa Cruz, 1156 High St, Santa Cruz, CA, 95064, United States of America
Keywords: Evolution; Bioinformatics; Gene cluster; Gene neighborhood; Cluster analysis; Transcription; Gene duplication; Gene expression; Genomics; Computational biology
DOI: 10.1186/1471-2105-13-253
Received 2012-03-31; accepted 2012-09-23; published 2012
【 Abstract 】

Background

In previous studies, gene neighborhoods—spatial clusters of co-expressed genes in the genome—have been defined using arbitrary rules such as requiring adjacency, a minimum number of genes, a fixed window size, or a minimum expression level. In the current study, we developed a Gene Neighborhood Scoring Tool (G-NEST) which combines genomic location, gene expression, and evolutionary sequence conservation data to score putative gene neighborhoods across all possible window sizes simultaneously.
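
The idea of scoring every window size at once, rather than fixing one in advance, can be illustrated with a minimal sketch. This is not the G-NEST implementation: the function names `pearson` and `score_windows` are hypothetical, the score used here (mean pairwise Pearson correlation of expression profiles within a window) is an assumed stand-in, and the evolutionary conservation component mentioned above is left out entirely.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: treat as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def score_windows(profiles, max_size=None):
    """Score every window of two or more consecutive genes.

    profiles: expression profiles for the genes of one chromosome,
              ordered by genomic position (assumed input layout).
    Returns a dict mapping (start_index, window_size) to the mean
    pairwise correlation inside that window (a hypothetical score,
    not the score defined by G-NEST).
    """
    n = len(profiles)
    max_size = n if max_size is None else min(max_size, n)
    scores = {}
    for size in range(2, max_size + 1):
        for start in range(n - size + 1):
            window = profiles[start:start + size]
            pairs = combinations(window, 2)
            scores[(start, size)] = mean(pearson(a, b) for a, b in pairs)
    return scores
```

Calling `score_windows` on a list of position-ordered profiles yields one score per (start, size) pair, so candidate neighborhoods of different sizes can be ranked against each other without committing to a single window size.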

Results

Using G-NEST on atlases of mouse and human tissue expression data, we found that large neighborhoods of ten or more genes are extremely rare in mammalian genomes. When they do occur, neighborhoods are typically composed of families of related genes. Both the highest scoring and the largest neighborhoods in mammalian genomes are formed by tandem gene duplication. Mammalian gene neighborhoods contain highly and variably expressed genes. Co-localized noisy gene pairs exhibit lower evolutionary conservation of their adjacent genome locations, suggesting that their shared transcriptional background may be disadvantageous. Genes that are essential to mammalian survival and reproduction are less likely to occur in neighborhoods, although neighborhoods are enriched with genes that function in mitosis. We also found that gene orientation and protein-protein interactions are partially responsible for maintenance of gene neighborhoods.

Conclusions

Our experiments using G-NEST confirm that tandem gene duplication is the primary driver of non-random gene order in mammalian genomes. Non-essentiality, co-functionality, gene orientation, and protein-protein interactions are additional forces that maintain gene neighborhoods, especially those formed by tandem duplicates. We expect G-NEST to be useful for other applications such as the identification of core regulatory modules, common transcriptional backgrounds, and chromatin domains. The software is available at http://docpollard.org/software.html

【 License 】

   
2012 Lemay et al.; licensee BioMed Central Ltd.
