Journal of Clinical Bioinformatics
mCOPA: analysis of heterogeneous features in cancer expression data
Melissa J Davis [2], Mark A Ragan [2], Colleen C Nelson [1], Stefan R Maetschke [2], Alperen Taciroglu [2], Chenwei Wang [1]
[1] Australian Prostate Cancer Research Centre – Queensland, Queensland University of Technology, Brisbane, 4102, Australia; [2] Institute for Molecular Bioscience, The University of Queensland, Brisbane, 4072, Australia
Keywords: Feature selection; Percentile; Bioinformatics; Heterogeneous; Subtype; Cluster; Expression profile; Expression data; Outliers; Cancer
Others: 804267; DOI: 10.1186/2043-9113-2-22
Received 2012-10-19; accepted 2012-12-03; published 2012
【 Abstract 】
Background
Cancer outlier profile analysis (COPA) has proven to be an effective approach to analyzing cancer expression data, leading to the discovery of the TMPRSS2 and ETS family gene fusion events in prostate cancer. However, the original COPA algorithm did not identify down-regulated outliers, and the currently available R package implementing the method is similarly restricted to the analysis of over-expressed outliers. Here we present a modified outlier detection method, mCOPA, which contains refinements to the outlier-detection algorithm, identifies both over- and under-expressed outliers, is freely available, and can be applied to any expression dataset.
Results
We compare our method to other feature-selection approaches, and demonstrate that mCOPA frequently selects more-informative features than do differential expression or variance-based feature selection approaches, and is able to recover observed clinical subtypes more consistently. We demonstrate the application of mCOPA to prostate cancer expression data, and explore the use of outliers in clustering, pathway analysis, and the identification of tumour suppressors. We analyse the under-expressed outliers to identify known and novel prostate cancer tumour suppressor genes, validating these against data in Oncomine and the Cancer Gene Index. We also demonstrate how a combination of outlier analysis and pathway analysis can identify molecular mechanisms disrupted in individual tumours.
Conclusions
We demonstrate that mCOPA offers advantages, compared to differential expression or variance, in selecting outlier features, and that the features so selected are better able to assign samples to clinically annotated subtypes. Further, we show that the biology explored by outlier analysis differs from that uncovered in differential expression or variance analysis. mCOPA is an important new tool for the exploration of cancer datasets and the discovery of new cancer subtypes, and can be combined with pathway and functional analysis approaches to discover mechanisms underpinning heterogeneity in cancers.
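The outlier-detection idea summarised in the Background can be illustrated with a minimal sketch. It follows the general COPA recipe (median-centre each gene, scale by its median absolute deviation, then read off a high percentile of the transformed values) and adds a mirrored low percentile for under-expressed outliers. This is an assumption-laden illustration only: the abstract does not specify mCOPA's refinements, and the function names `copa_transform` and `outlier_scores` and the default 95th percentile are hypothetical choices, not the authors' implementation.

```python
import numpy as np


def copa_transform(expr):
    """Median-centre each gene (row) and scale it by its median absolute deviation."""
    expr = np.asarray(expr, dtype=float)
    med = np.median(expr, axis=1, keepdims=True)
    mad = np.median(np.abs(expr - med), axis=1, keepdims=True)
    # Genes with zero MAD cannot be scaled; mark them as missing (assumed handling).
    mad[mad == 0] = np.nan
    return (expr - med) / mad


def outlier_scores(expr, percentile=95.0):
    """Return per-gene (upper, lower) tail scores of the transformed data.

    A large positive `upper` suggests a subset of samples with over-expression;
    a large negative `lower` suggests a subset with under-expression.
    """
    z = copa_transform(expr)
    upper = np.nanpercentile(z, percentile, axis=1)
    lower = np.nanpercentile(z, 100.0 - percentile, axis=1)
    return upper, lower
```

Ranking genes by `upper` (descending) or `lower` (ascending) would then give candidate over- and under-expressed outlier features for clustering or pathway analysis.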
【 License 】
2012 Wang et al.; licensee BioMed Central Ltd.