Dataset Information

Img2Mol - accurate SMILES recognition from molecular graphical depictions.


ABSTRACT: The automatic recognition of the molecular content of a molecule's graphical depiction is an extremely challenging problem that remains largely unsolved despite decades of research. Recent advances in neural machine translation enable the auto-encoding of molecular structures in a continuous vector space of fixed size (latent representation) with low reconstruction errors. In this paper, we present a fast and accurate model combining deep convolutional neural network learning from molecule depictions and a pre-trained decoder that translates the latent representation into the SMILES representation of the molecules. This combination allows us to precisely infer a molecular structure from an image. Our rigorous evaluation shows that Img2Mol is able to correctly translate up to 88% of the molecular depictions into their SMILES representation. A pretrained version of Img2Mol is made publicly available on GitHub for non-commercial users.
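A minimal sketch of how a figure such as "up to 88% correctly translated" could be computed, assuming the metric is exact string match between predicted and reference SMILES. The function name `exact_match_accuracy` and the sample strings are hypothetical and do not come from the paper. Because one molecule can be written as many different SMILES strings, the sketch also assumes both lists were canonicalized by the same cheminformatics toolkit beforehand.

```python
def exact_match_accuracy(predicted, reference):
    """Return the fraction of predictions identical to their reference SMILES.

    Assumption: both lists hold SMILES strings already canonicalized by the
    same toolkit, so plain string equality means "same molecule".
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0  # no depictions evaluated
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


# Hypothetical example: three depictions, two recognized correctly.
predicted = ["CCO", "c1ccccc1", "CC"]
reference = ["CCO", "c1ccccc1", "CCC"]
accuracy = exact_match_accuracy(predicted, reference)
```

Exact match is a strict criterion: a prediction that differs from the reference by a single atom counts as a complete miss.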

SUBMITTER: Clevert DA 

PROVIDER: S-EPMC8565361 | biostudies-literature | 2021 Nov

REPOSITORIES: biostudies-literature

Publications

Img2Mol - accurate SMILES recognition from molecular graphical depictions.

Djork-Arné Clevert, Tuan Le, Robin Winter, Floriane Montanari

Chemical Science, 2021-09-29, issue 42



Similar Datasets

| S-EPMC11525969 | biostudies-literature
| S-EPMC3325157 | biostudies-literature
| S-EPMC6660760 | biostudies-literature
| S-EPMC6873550 | biostudies-literature
| S-EPMC4908353 | biostudies-literature
| S-EPMC5856202 | biostudies-literature
| S-EPMC3381051 | biostudies-literature
| S-EPMC7914444 | biostudies-literature
| S-EPMC8695687 | biostudies-literature