BMC Genomics
Identification of biomarkers that distinguish chemical contaminants based on gene expression profiles
Edward J Perkins [3], Chaoyang Zhang [4], Choo Y Ang [5], David R Johnson [1], Xin Guan [5], Youping Deng [2], Junmei Ai [2], Xiaomou Wei [2]
[1] Conestoga-Rovers & Associates, 2290 Springlake Road, Suite 108, Dallas, TX 75234, USA
[2] Department of Internal Medicine, Rush University Cancer Center, Rush University Medical Center, Kidston House, 630 S. Hermitage Ave. Room 408, Chicago, IL 60612, USA
[3] US Army Engineer Research and Development Center, 3909 Halls Ferry Road, Vicksburg, MS 39180, USA
[4] School of Computing, University of Southern Mississippi, Hattiesburg, MS 39406, USA
[5] SpecPro Inc, Vicksburg, MS 39180, USA
Keywords: Classification; Chemical; Hepatocytes; Microarray; Biomarker
Others: 1217594 DOI: 10.1186/1471-2164-15-248
Received 2013-06-30, accepted 2014-03-11, published 2014
【 Abstract 】
Background
High-throughput transcriptomic profiles, such as those generated using microarrays, have been useful in identifying biomarkers for classification and toxicity prediction. Here, we investigated the use of microarrays to predict chemical toxicants and their possible mechanisms of action.
Results
In this study, in vitro cultures of primary rat hepatocytes were exposed to 105 chemicals and vehicle controls, representing 14 compound classes. We comprehensively compared various methods for normalizing gene expression profiles, together with feature selection and classification algorithms, for classifying these 105 chemicals into 14 compound classes. We found that normalization had little effect on the averaged classification accuracy. Two support vector machine (SVM) methods, LibSVM and sequential minimal optimization, had better classification performance than the other methods. SVM recursive feature selection (SVM-RFE) had the highest overfitting rate when an independent dataset was used for prediction. Therefore, we developed a new feature selection algorithm, called the gradient method, that had relatively high training classification accuracy and prediction accuracy, with the lowest overfitting rate of the methods tested. Analysis of biomarkers that distinguished the 14 classes of compounds identified a group of genes, principally involved in cell cycle function, that were significantly downregulated by metal and inflammatory compounds but were induced by anti-microbial drugs, cancer-related drugs, pesticides, and PXR mediators.
Conclusions
Our results indicate that using microarrays and a supervised machine learning approach to predict chemical toxicants, their potential toxicity, and their mechanisms of action is practical and efficient. Choosing the right feature selection and classification algorithms is critical for this multi-category classification and prediction task.
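The Results compare feature selection methods by how much accuracy is lost when a model trained on one dataset is applied to an independent one. The abstract does not state how the overfitting rate was computed, so the sketch below assumes a simple definition (training accuracy minus independent-set accuracy) and substitutes a nearest-centroid classifier for the SVMs so that it runs without third-party libraries. All function names, class labels, and expression values are hypothetical toy values, not data or code from the study.

```python
from statistics import mean

def centroids(X, y, features):
    """Per-class mean profile, restricted to the selected feature indices."""
    out = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        out[label] = [mean(r[f] for r in rows) for f in features]
    return out

def predict(cents, x, features):
    """Assign the class whose centroid is nearest (squared Euclidean distance)."""
    def dist(c):
        return sum((x[f] - ci) ** 2 for f, ci in zip(features, c))
    return min(sorted(cents), key=lambda lab: dist(cents[lab]))

def accuracy(cents, X, y, features):
    """Fraction of samples assigned to their true class."""
    hits = sum(predict(cents, x, features) == lab for x, lab in zip(X, y))
    return hits / len(y)

def overfitting_gap(X_train, y_train, X_test, y_test, features):
    """Assumed definition: training accuracy minus independent-set accuracy."""
    cents = centroids(X_train, y_train, features)
    train_acc = accuracy(cents, X_train, y_train, features)
    test_acc = accuracy(cents, X_test, y_test, features)
    return train_acc, test_acc, train_acc - test_acc

# Synthetic toy profiles (3 "genes" per sample). Gene 0 separates the classes
# in both sets; gene 2 separates them only in the training set.
X_train = [[1.0, 0.2, 5.0], [1.2, -0.1, 5.2],
           [-1.0, 0.1, -5.0], [-1.1, -0.2, -5.1]]
y_train = ["metal", "metal", "pesticide", "pesticide"]
X_test = [[0.9, 0.0, -5.0], [1.1, 0.1, 5.0],
          [-1.0, 0.0, 5.0], [-0.9, 0.1, -5.0]]
y_test = ["metal", "metal", "pesticide", "pesticide"]

stable = overfitting_gap(X_train, y_train, X_test, y_test, [0])
spurious = overfitting_gap(X_train, y_train, X_test, y_test, [0, 2])
print("gene 0 only:   ", stable)
print("genes 0 and 2: ", spurious)
```

In this toy case both feature sets fit the training samples perfectly, but the set that includes the training-only gene loses accuracy on the independent samples. That is the kind of gap a feature selection method with a low overfitting rate is meant to avoid.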
【 License 】
2014 Wei et al.; licensee BioMed Central Ltd.