Chemistry Central Journal
Development and experimental test of support vector machines virtual screening method for searching Src inhibitors from large compound libraries
Bucong Han [3], Xiaohua Ma [3], Ruiying Zhao [1], Jingxian Zhang [3], Xiaona Wei [3], Xianghui Liu [3], Xin Liu [3], Cunlong Zhang [2], Chunyan Tan [2], Yuyang Jiang [2], Yuzong Chen [3]
[1] Central Research Institute of China Chemical Science and Technology, 20 Xueyuan Road, Haidian District, Beijing, 100083, People’s Republic of China
[2] The Key Laboratory of Chemical Biology, Guangdong Province, The Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong, 518055, People’s Republic of China
[3] Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
Keywords: Support vector machine; Virtual screening; Kinase inhibitor; Computer aided drug design; c-src; Src
DOI: 10.1186/1752-153X-6-139
Received 2012-07-13; accepted 2012-11-07; published 2012
【 Abstract 】
Background
Src plays various roles in tumour progression, invasion, metastasis, angiogenesis and survival. It is one of the targets of multi-target kinase inhibitors in clinical use and in trials for the treatment of leukemia and other cancers. These successes, together with the emergence of drug resistance in some patients, have raised significant interest in discovering new Src inhibitors. Various in-silico methods have been used in some of these efforts. It is desirable to explore additional in-silico methods, particularly those capable of searching large compound libraries at high yields and reduced false-hit rates.
Results
We evaluated support vector machines (SVM) as virtual screening tools for searching Src inhibitors from large compound libraries. In 5-fold cross-validation studies, SVM trained and tested on 1,703 inhibitors and 63,318 putative non-inhibitors correctly identified 93.53% to 95.01% of inhibitors and 99.81% to 99.90% of non-inhibitors. SVM trained on the 1,703 inhibitors reported before 2011 and the 63,318 putative non-inhibitors correctly identified 70.45% of the 44 inhibitors reported since 2011, and predicted as inhibitors 44,843 (0.33%) of 13.56M PubChem compounds, 1,496 (0.89%) of 168K MDDR compounds, and 719 (7.73%) of 9,305 MDDR compounds similar to the known inhibitors.
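The percentages quoted above follow from simple ratios. The sketch below is only an illustrative check of that arithmetic, not the authors' code; the helper name `rate` is hypothetical, and the library sizes 13.56M and 168K are the rounded figures given in the text, so the computed rates are approximate.

```python
def rate(hits: int, total: int) -> float:
    """Return hits / total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * hits / total


# (compounds predicted as inhibitors, library size) as quoted in the text.
# The PubChem and MDDR sizes are rounded there, so treat them as approximate.
screens = {
    "PubChem": (44_843, 13_560_000),
    "MDDR": (1_496, 168_000),
    "MDDR compounds similar to known inhibitors": (719, 9_305),
}

for name, (hits, total) in screens.items():
    print(f"{name}: {rate(hits, total):.2f}% predicted as inhibitors")

# 70.45% of the 44 inhibitors reported since 2011 corresponds to a whole
# number of compounds; rounding recovers that count.
recovered = round(0.7045 * 44)
print(f"{recovered} of 44 recent inhibitors identified "
      f"({rate(recovered, 44):.2f}%)")
```

Read this way, the virtual hit rate on the general libraries stays below 1%, while the rate rises to several percent on the MDDR subset that is structurally similar to known inhibitors.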
Conclusions
Compared with similarity-based and other machine-learning VS methods developed from the same set of training compounds and molecular descriptors, SVM showed comparable yield and reduced false-hit rates in searching large compound libraries. We tested three virtual hits that share the same novel scaffold, taken from in-house chemical libraries and not previously reported as Src inhibitors; one of them showed moderate activity. SVM may therefore be worth exploring for searching Src inhibitors from large compound libraries at low false-hit rates.
【 License 】
2012 Han et al.; licensee Chemistry Central Ltd.