| Journal of Cheminformatics | |
| Benchmarks for interpretation of QSAR models | |
| Pavel Polishchuk [1], Mariia Matveieva [1] | |
| [1] Institute of Molecular and Translational Medicine, Faculty of Medicine and Dentistry, Palacky University, University Hospital in Olomouc, Hnevotinska 5, 77900, Olomouc, Czech Republic; | |
| Keywords: QSAR model interpretation; Benchmark data set; Synthetic data set; Interpretability metrics; Atom contributions; Graph convolutional neural networks | |
| DOI : 10.1186/s13321-021-00519-x | |
| Source: Springer | |
【 Abstract 】
Interpretation of QSAR models helps to understand the complex nature of biological or physicochemical processes, to guide structural optimization, and to perform knowledge-based validation of the models. Highly predictive models are usually complex, and their interpretation is non-trivial. This is particularly true for modern neural networks. Various approaches to the interpretation of these models exist. However, it is difficult to evaluate and compare the performance and applicability of these ever-emerging methods. Herein, we developed several benchmark data sets with end-points determined by pre-defined patterns. These data sets are intended for evaluating the ability of interpretation approaches to retrieve these patterns. They represent tasks of different complexity levels, from simple atom-based additive properties to pharmacophore hypotheses. We proposed several quantitative metrics of interpretation performance. The applicability of the benchmarks and metrics was demonstrated on a set of conventional models and end-to-end graph convolutional neural networks, interpreted by the previously suggested universal ML-agnostic approach for structural interpretation. We anticipate that these benchmarks will be useful in the evaluation of new interpretation approaches and in the investigation of decision making of complex “black box” models.
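To make the idea of an end-point "determined by pre-defined patterns" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy molecule given as a plain list of atom symbols, an additive end-point equal to the number of nitrogen atoms, and a simple top-n style score; the function names `additive_endpoint`, `true_contributions`, and `top_n_score`, the input format, and the example contribution values are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def additive_endpoint(atoms: Sequence[str], pattern: str = "N") -> int:
    """Synthetic end-point: number of atoms matching the pre-defined pattern."""
    return sum(1 for a in atoms if a == pattern)


def true_contributions(atoms: Sequence[str], pattern: str = "N") -> List[int]:
    """Ground-truth atom contributions: 1 for pattern atoms, 0 otherwise."""
    return [1 if a == pattern else 0 for a in atoms]


def top_n_score(truth: Sequence[int], estimated: Sequence[float]) -> float:
    """Fraction of ground-truth atoms found among the n highest-ranked atoms,
    where n is the number of ground-truth atoms in the molecule."""
    n = sum(truth)
    if n == 0:
        raise ValueError("molecule has no ground-truth atoms")
    # Rank atom indices by estimated contribution, highest first.
    ranked = sorted(range(len(truth)), key=lambda i: estimated[i], reverse=True)
    return sum(truth[i] for i in ranked[:n]) / n


# Toy molecule as a list of heavy-atom symbols (hypothetical input format).
atoms = ["C", "N", "C", "O", "N"]
truth = true_contributions(atoms)
# Hypothetical contributions, as some interpretation method might return them.
estimated = [0.1, 0.9, 0.3, 0.2, 0.05]
score = top_n_score(truth, estimated)
print(additive_endpoint(atoms), truth, score)
```

Because the ground-truth contributions are known by construction, any interpretation method can be scored against them directly; in this toy case only one of the two nitrogen atoms is ranked in the top two, so the score is 0.5.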
【 License 】
CC BY