Dataset Information


Deep scaffold hopping with multimodal transformer neural networks

ABSTRACT: Scaffold hopping is a central task in modern medicinal chemistry for rational drug design. It aims to design molecules with novel scaffolds that share similar target biological activity with known hit molecules. Traditionally, scaffold hopping depends on searching databases of available compounds, an approach that cannot exploit the vast chemical space. In this study, we reformulated the task as supervised molecule-to-molecule translation that generates hopped molecules that are novel in 2D structure but similar in 3D structure, inspired by the fact that candidate compounds bind to their targets through 3D conformations. To train the model efficiently, we curated over 50 thousand pairs of molecules with increased bioactivity, similar 3D structure, but different 2D structure from a public bioactivity database, spanning 40 kinases commonly investigated by medicinal chemists. Moreover, we designed a multimodal molecular transformer architecture that integrates molecular 3D conformer information through a spatial graph neural network and protein sequence information through a Transformer. The trained DeepHop model was shown to generate molecules of which around 70% had improved bioactivity together with high 3D similarity but low 2D scaffold similarity to the template molecules. This ratio was 1.9 times higher than that of other state-of-the-art deep learning methods and of rule-based and virtual screening-based methods. Furthermore, we demonstrated that the model could generalize to new target proteins through fine-tuning with a small set of active compounds. Case studies also showed the advantages and usefulness of DeepHop in practical scaffold hopping scenarios.
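
The pair criterion described in the abstract (increased bioactivity, similar 3D structure, different 2D structure) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `CandidatePair` record, the `is_hop_pair` and `success_rate` helpers, and the three threshold values are hypothetical and are not taken from this record. The similarity scores are assumed to be precomputed elsewhere and scaled to the range 0 to 1.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical thresholds, chosen only for illustration.
MIN_ACTIVITY_GAIN = 1.0   # required bioactivity increase (log units)
MAX_SIM_2D = 0.6          # 2D scaffold similarity must not exceed this
MIN_SIM_3D = 0.6          # 3D similarity must reach at least this


@dataclass(frozen=True)
class CandidatePair:
    """A template molecule paired with a candidate hopped molecule (hypothetical record)."""
    template_id: str
    hopped_id: str
    template_activity: float  # bioactivity of the template molecule
    hopped_activity: float    # bioactivity of the hopped molecule
    sim_2d: float             # precomputed 2D scaffold similarity in [0, 1]
    sim_3d: float             # precomputed 3D similarity in [0, 1]


def is_hop_pair(pair: CandidatePair) -> bool:
    """Return True if the pair meets all three criteria at once."""
    gain = pair.hopped_activity - pair.template_activity
    return (
        gain >= MIN_ACTIVITY_GAIN
        and pair.sim_2d <= MAX_SIM_2D
        and pair.sim_3d >= MIN_SIM_3D
    )


def success_rate(pairs: Iterable[CandidatePair]) -> float:
    """Fraction of pairs that satisfy the criteria (0.0 for an empty input)."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(is_hop_pair(p) for p in pairs) / len(pairs)
```

The same predicate could serve both to filter candidate training pairs and to score generated molecules against their templates, which is how a ratio such as the one reported in the abstract might be computed under these assumptions.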

Supplementary Information

The online version contains supplementary material available at 10.1186/s13321-021-00565-5.

PROVIDER: S-EPMC8590293 | BioStudies

REPOSITORIES: biostudies

Similar Datasets

| S-EPMC5511031 | BioStudies
| S-EPMC6842385 | BioStudies
| S-EPMC2703967 | BioStudies
| S-EPMC6312796 | BioStudies
| S-EPMC3180241 | BioStudies
| S-EPMC6409521 | BioStudies
| S-EPMC4025776 | BioStudies
| S-EPMC8596819 | BioStudies
| S-EPMC7339533 | BioStudies
| S-EPMC6225167 | BioStudies