Journal article

【Abstract】

This paper demonstrates how a Neural Grammar Network (NGN) learns to classify and score molecules for a variety of tasks in chemistry and toxicology. In addition to a more detailed analysis of previously studied datasets, we introduce three new datasets (BBB, FXa, and toxicology) to show the generality of the approach. A new experimental methodology is developed and applied to both the new and the previously studied datasets. This methodology is rigorous and statistically grounded, and culminates in a Wilcoxon significance test that demonstrates the effectiveness of the system. We further include a complete generalization of the specific technique to arbitrary grammars and datasets, using a mathematical abstraction that allows researchers in different domains to apply the method to their own work.

Background: Our work can be viewed as an alternative to existing methods for solving the quantitative structure-activity relationship (QSAR) problem. To this end, we review a number of approaches from both a methodological and a performance perspective. In addition to these approaches, we also examined a number of chemical properties that can be used by generic classifier systems, such as feed-forward artificial neural networks. In studying these approaches, we identified a set of interesting benchmark problem sets to which many of the above approaches had been applied. These included: ACE, AChE, AR, BBB, BZR, Cox2, DHFR, ER, FXa, GPB, Therm, and Thr. Finally, we developed our own benchmark set by collecting data on toxicology.

Results: Our results show that our system performs better than, or comparably to, the existing methods over a broad range of problem types. Our method does not require the expert knowledge that is necessary to apply the other methods to novel problems.

Conclusions: We conclude that our success is due to the ability of our system to: 1) encode molecules losslessly before presentation to the learning system, and 2) leverage the design of molecular description languages to facilitate the identification of relevant structural attributes of the molecules over different problem domains.

【License】

CC BY
© Kremer et al; licensee BioMed Central Ltd. 2010

BMC Bioinformatics
Classifying and scoring of molecules with the NGN: new datasets, significance tests, and generalization
Research
Eddie YT Ma¹, Stefan C Kremer², Christopher JF Cameron²
[1] Department of Biology at the University of Waterloo, Waterloo, Ontario, Canada
[2] School of Computer Science at the University of Guelph, Guelph, Ontario, Canada
Keywords: Support Vector Machine; Probabilistic Neural Network; Parse Tree; Input String; Grammar Rule
DOI: 10.1186/1471-2105-11-S8-S4
Source: Springer