BMC Bioinformatics
MiningABs: mining associated biomarkers across multi-connected gene expression datasets
Chun-Pei Cheng [1], Christopher DeBoever [1], Kelly A Frazer [3], Yu-Cheng Liu [2], Vincent S Tseng [4]
[1] Moores UCSD Cancer Center, University of California San Diego, La Jolla, California, USA
[2] Department of Environmental and Occupational Health, National Cheng Kung University, Tainan, Taiwan
[3] Institute for Genomic Medicine, University of California San Diego, La Jolla, California, USA
[4] Institute of Medical Informatics, National Cheng Kung University, Tainan, Taiwan
Keywords: Gene expression; Combination effects; Meta-analysis; Associated biomarkers; Mining ABs
DOI: 10.1186/1471-2105-15-173
Received 2013-11-06, accepted 2014-06-03, published 2014
【 Abstract 】
Background
Human disease often arises from alterations in a set of associated genes rather than from alterations in unassociated individual genes. Most previous microarray-based meta-analyses identified disease-associated genes or biomarkers without considering genetic interactions. In this study, we therefore present the first meta-analysis method capable of taking gene combination effects into account to efficiently identify associated biomarkers (ABs) across different microarray platforms.
Results
We propose a new meta-analysis approach called MiningABs to mine ABs across different array-based datasets. The similarity between paired probe sequences is quantified and used as a bridge to connect these datasets. The ABs are subsequently identified from an “improved” common logit model (c-LM), obtained by combining several sibling-like LMs in a heuristic genetic algorithm selection process. Our approach is evaluated on two sets of gene expression datasets: (i) four esophageal squamous cell carcinoma datasets and (ii) three hepatocellular carcinoma datasets. Based on an unbiased reciprocal test, we demonstrate that each gene in a group of ABs is required to maintain high cancer sample classification accuracy, and we observe that ABs are not limited to genes common to all platforms. Gene Ontology (GO) enrichment, literature survey, and network analyses of the ABs indicate that they are not only strongly related to cancer development but also highly connected in a diverse network of biological interactions.
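The idea of using probe-sequence similarity as a bridge between platforms can be illustrated with a minimal sketch. This is not the MiningABs implementation (which is distributed as Java source): the similarity score here is the generic `difflib.SequenceMatcher` ratio from the Python standard library, standing in for whatever alignment-based measure the method actually uses, and the probe identifiers, sequences, function names and the 0.9 threshold are all hypothetical values chosen for illustration.

```python
from difflib import SequenceMatcher


def probe_similarity(seq_a, seq_b):
    """Return a similarity score in [0, 1] for two probe sequences.

    Assumption: difflib's ratio (2 * matches / total length) is used as a
    stand-in for a proper sequence-alignment score.
    """
    matcher = SequenceMatcher(None, seq_a.upper(), seq_b.upper(), autojunk=False)
    return matcher.ratio()


def bridge_probes(platform_a, platform_b, threshold=0.9):
    """Link each probe on platform A to its most similar probe on platform B.

    A link is kept only when the best score reaches the (hypothetical)
    threshold; probes without a close counterpart stay unlinked.
    """
    links = {}
    for id_a, seq_a in platform_a.items():
        best_id, best_score = None, 0.0
        for id_b, seq_b in platform_b.items():
            score = probe_similarity(seq_a, seq_b)
            if score > best_score:
                best_id, best_score = id_b, score
        if best_id is not None and best_score >= threshold:
            links[id_a] = (best_id, best_score)
    return links


# Hypothetical 25-mer probes from two different array platforms.
platform_a = {
    "A_probe_1": "ACGTACGTACGTACGTACGTACGTA",
    "A_probe_2": "TTTTGGGGCCCCAAAATTTTGGGGC",
}
platform_b = {
    "B_probe_9": "ACGTACGTACGTACGTACGTACGTA",
    "B_probe_7": "GATTACAGATTACAGATTACAGATT",
}

links = bridge_probes(platform_a, platform_b)
for id_a, (id_b, score) in links.items():
    print(f"{id_a} <-> {id_b} (similarity {score:.2f})")
```

In this toy input only `A_probe_1` finds a counterpart, since its sequence is identical to `B_probe_9`; `A_probe_2` has no sufficiently similar probe on the second platform and is left unlinked. Linking by sequence rather than by gene symbol is what would allow probes that are not annotated identically across platforms to still be connected.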
Conclusions
The proposed meta-analysis method called MiningABs is able to efficiently identify ABs from different independently performed array-based datasets, and we show its validity in cancer biology via GO enrichment, literature survey and network analyses. We postulate that the ABs may facilitate novel target and drug discovery, leading to improved clinical treatment. Java source code, tutorial, example and related materials are available at http://sourceforge.net/projects/miningabs/.
【 License 】
2014 Cheng et al.; licensee BioMed Central Ltd.