BMC Bioinformatics
Prediction of heterogeneous differential genes by detecting outliers to a Gaussian tight cluster
Zihua Yang [2], Zhengrong Yang [1]
[1] College of Life and Environmental Sciences, Exeter University, Stocker Road, Exeter, EX4 4QD, UK
[2] Wolfson Institute for Preventive Medicine, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK
Keywords: Microarray; Differentially expressed genes; Outlier; Cancer
DOI: 10.1186/1471-2105-14-81
Received 2012-04-12, accepted 2013-02-14, published 2013
【 Abstract 】
Background
Heterogeneously and differentially expressed genes (hDEGs) are a common phenomenon due to biological diversity. An hDEG is often observed in gene expression experiments with two experimental conditions, where it is highly expressed in only a few experimental samples, or in drug trial experiments for cancer studies, where drug resistance is heterogeneous within the disease group. These highly expressed samples are called outliers. Accurate detection of outliers among hDEGs is therefore desirable for disease diagnosis and effective drug design. The standard approach for detecting hDEGs is to choose an appropriate subset of outliers to represent the experimental group. However, existing methods typically overlook hDEGs with very few outliers.
Results
This paper presents a simple algorithm for detecting hDEGs by sequentially testing for potential outliers with respect to a tight cluster of non-outliers, among an ordered subset of the experimental samples. This avoids making any restrictive assumptions about how the outliers are distributed. Simulated and real data illustrate that the proposed algorithm achieves a good separation between the tight cluster of low expressions and the outliers for hDEGs.
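The sequential idea described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the seed cluster size `min_cluster`, the one-sided Gaussian tail test, and the threshold `alpha` are all assumptions chosen for illustration, and the function name `detect_outliers` is hypothetical.

```python
import math
from statistics import mean, stdev


def detect_outliers(values, min_cluster=5, alpha=0.001):
    """Split one gene's expression values into (cluster, outliers).

    Illustrative only: the seed size, the Gaussian upper-tail test and
    the threshold `alpha` are assumptions, not taken from the paper.
    """
    ordered = sorted(values)
    if len(ordered) <= min_cluster:
        return ordered, []
    # Seed the tight cluster with the lowest-expressed samples.
    cluster = ordered[:min_cluster]
    for i in range(min_cluster, len(ordered)):
        mu = mean(cluster)
        sigma = max(stdev(cluster), 1e-12)  # guard against zero variance
        z = (ordered[i] - mu) / sigma
        # Upper-tail probability of the candidate under the Gaussian
        # fitted to the current cluster.
        p = 0.5 * math.erfc(z / math.sqrt(2))
        if p < alpha:
            # Values are sorted, so every later sample is at least as extreme.
            return cluster, ordered[i:]
        # Otherwise absorb the candidate and refit on the next pass.
        cluster.append(ordered[i])
    return cluster, []
```

For example, `detect_outliers([1.0, 1.1, 0.9, 1.05, 0.95, 1.02, 0.98, 5.0, 6.2])` keeps the seven values near 1.0 in the cluster and flags 5.0 and 6.2. Because each candidate is tested only against the current cluster, no model of the outlier distribution is needed, which is the property the abstract emphasises.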
Conclusions
The proposed algorithm assesses each potential outlier in relation to the tight cluster of non-outliers without making explicit assumptions about the outlier distribution. Simulated examples and breast cancer data sets are used to illustrate the suitability of the proposed algorithm for identifying hDEGs with small numbers of outliers.
【 License 】
2013 Yang and Yang; licensee BioMed Central Ltd.