BMC Bioinformatics
A novel method for cross-species gene expression analysis
Erik Kristiansson [1], Tobias Österlund [5], Lina Gunnarsson [4], Gabriella Arne [3], D G Joakim Larsson [2], Olle Nerman [1]
[1] Department of Mathematical Statistics, Chalmers University of Technology/University of Gothenburg, Gothenburg, Sweden
[2] Department of Infectious Diseases, Institute of Biomedicine, The Sahlgrenska Academy at the University of Gothenburg, Gothenburg, Sweden
[3] Sahlgrenska Cancer Center, Department of Pathology, Sahlgrenska Academy at The University of Gothenburg, Gothenburg, Sweden
[4] Institute of Neuroscience and Physiology, the Sahlgrenska Academy at the University of Gothenburg, Gothenburg, Sweden
[5] Department of Chemical and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
Keywords: RNA-seq; Microarray; Paralogs; Orthologs; Meta-analysis; Evolution; Gene expression
DOI: 10.1186/1471-2105-14-70
Received 2012-06-25, accepted 2013-02-13, published 2013
【 Abstract 】
Background
Analysis of gene expression from different species is a powerful way to identify evolutionarily conserved transcriptional responses. However, due to evolutionary events such as gene duplication, there is no one-to-one correspondence between genes from different species, which makes comparison of their expression profiles complex.
Results
In this paper we describe a new method for cross-species meta-analysis of gene expression. The method takes the homology structure between the compared species into account and can therefore compare expression data from genes with any number of orthologs and paralogs. A simulation study shows that the proposed method yields a substantial increase in statistical power compared to previously suggested procedures. As a proof of concept, we analyzed microarray data from heat stress experiments performed in eight species and identified several well-known evolutionarily conserved transcriptional responses. The method was also applied to gene expression profiles from five studies of estrogen-exposed fish, and both known and potentially novel responses were identified.
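To make the idea of comparing genes with any number of orthologs and paralogs concrete, the following is a minimal sketch and not the authors' method or their R implementation. It assumes each homology group is given as a mapping from species to the p-values of that species' genes in the group. Paralogs within a species are collapsed with a Bonferroni-adjusted minimum (an assumption made here for illustration), and the per-species values are then combined with Fisher's method. The function names `fisher_combine`, `species_pvalue` and `group_pvalue` are hypothetical.

```python
import math


def fisher_combine(pvalues):
    """Combine independent p-values (each > 0) with Fisher's method."""
    k = len(pvalues)
    # Fisher's statistic is X = -2 * sum(ln p); we work with X / 2.
    half_stat = -sum(math.log(p) for p in pvalues)
    # Chi-square survival function with 2k degrees of freedom,
    # using the closed form that holds for even degrees of freedom.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def species_pvalue(paralog_pvalues):
    """Collapse one species' paralogs: Bonferroni-adjusted minimum p-value.

    This collapsing rule is an illustrative assumption, not taken from the paper.
    """
    return min(1.0, len(paralog_pvalues) * min(paralog_pvalues))


def group_pvalue(group):
    """Combine evidence across species for one homology group.

    `group` maps a species name to the list of p-values of its genes.
    """
    return fisher_combine([species_pvalue(ps) for ps in group.values()])


# Hypothetical homology group: two paralogs in one species, one gene in another.
example_group = {"species_a": [0.01, 0.2], "species_b": [0.03]}
combined = group_pvalue(example_group)
```

A group with a single gene in a single species returns that gene's p-value unchanged, so one-to-one and many-to-many groups are handled by the same code path.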
Conclusions
The method described in this paper will further increase the potential and reliability of meta-analysis of gene expression profiles from evolutionarily distant species. The method has been implemented in R and is freely available at http://bioinformatics.math.chalmers.se/Xspecies/.
【 License 】
2013 Kristiansson et al.; licensee BioMed Central Ltd.