Dataset Information

Calculation of exact Shapley values for support vector machines with Tanimoto kernel enables model interpretation.


ABSTRACT: The support vector machine (SVM) algorithm is popular in chemistry and drug discovery. SVM models have black box character. Their predictions can be interpreted through feature weighting or the model-agnostic Shapley additive explanations (SHAP) formalism that locally approximates Shapley values (SVs) originating from game theory. We introduce an algorithm termed SV-expressed Tanimoto similarity (SVETA) for the exact calculation of SVs to explain SVM models employing the Tanimoto kernel, the gold standard for the assessment of molecular similarity. For a model system, the exact calculation of SVs is demonstrated. In an SVM-based compound classification task from drug discovery, only a limited correlation between exact SV and SHAP values is observed, prohibiting the use of approximate values for rationalizing predictions. For exemplary test compounds, atom-based mapping of prioritized features delineates coherent substructures that closely resemble those obtained by analyzing independently derived random forest models, thus providing consistent explanations.
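The quantities named in the abstract can be illustrated with a small sketch. It computes the Tanimoto similarity of two toy fingerprints and then obtains exact Shapley values by brute-force enumeration of all feature coalitions. This is not the SVETA algorithm itself, which is not described in this record; the function names `tanimoto` and `exact_shapley`, the toy fingerprints, and the convention that features outside a coalition are masked out of both fingerprints are assumptions made for illustration only.

```python
from itertools import combinations
from math import factorial


def tanimoto(a, b):
    """Tanimoto similarity of two sets of 'on' feature indices."""
    union = len(a | b)
    # Convention assumed here: two empty fingerprints have similarity 0.
    return len(a & b) / union if union else 0.0


def exact_shapley(x, y):
    """Brute-force Shapley values of each feature for tanimoto(x, y).

    Assumption: the players are the features set in x or y, and a feature
    outside the coalition is masked out of both fingerprints.
    The cost grows exponentially with the number of players.
    """
    players = sorted(x | y)
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                with_p = s | {p}
                gain = tanimoto(x & with_p, y & with_p) - tanimoto(x & s, y & s)
                total += weight * gain
        phi[p] = total
    return phi


# Toy fingerprints (hypothetical): features 1 and 2 are shared,
# feature 0 is unique to x and feature 3 is unique to y.
x = {0, 1, 2}
y = {1, 2, 3}
phi = exact_shapley(x, y)
print(phi, sum(phi.values()), tanimoto(x, y))
```

Under these assumptions the Shapley values sum to the Tanimoto similarity of the two fingerprints, shared features receive positive contributions, and features present in only one fingerprint receive negative ones. Enumeration is only feasible for a handful of features, which is why a dedicated exact algorithm is needed for real fingerprints.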

SUBMITTER: Feldmann C 

PROVIDER: S-EPMC9464958 | biostudies-literature | 2022 Sep

REPOSITORIES: biostudies-literature

Publications

Calculation of exact Shapley values for support vector machines with Tanimoto kernel enables model interpretation.

Christian Feldmann, Jürgen Bajorath

iScience, 2022 Aug 27, issue 9


Similar Datasets

| S-EPMC3218421 | biostudies-literature
| S-EPMC8626003 | biostudies-literature
| S-EPMC1390438 | biostudies-literature
| S-EPMC8265031 | biostudies-literature
| S-EPMC9349278 | biostudies-literature
| S-EPMC6294291 | biostudies-literature
| S-EPMC4666603 | biostudies-literature
| S-EPMC8057672 | biostudies-literature
| S-EPMC10656054 | biostudies-literature
| S-EPMC9005067 | biostudies-literature