Dataset Information

Single Model for Organic and Inorganic Chemical Named Entity Recognition in ChemDataExtractor.


ABSTRACT: Chemical Named Entity Recognition (NER) forms the basis of information extraction tasks in the chemical domain. However, while such tasks can involve multiple domains of chemistry at the same time, currently available named entity recognizers are specialized in one part of chemistry, so such workflows fail for a biased subset of mentions. This paper presents a single model that performs close to the state of the art for both organic (CHEMDNER, 89.7 F1 score) and inorganic (Matscholar, 88.0 F1 score) NER tasks at the same time. Our NER system, which uses the BERT architecture, is available as part of ChemDataExtractor 2.1, along with the data sets and scripts used to train the model.
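
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an entity-level F1 score can be computed. It assumes exact-match scoring over (start, end, label) spans, which this record does not confirm as the protocol used for the CHEMDNER or Matscholar figures; the function name `entity_f1` and the example spans are hypothetical and do not come from ChemDataExtractor.

```python
def entity_f1(gold, predicted):
    """Return (precision, recall, F1) for exact-match entity spans.

    Assumption: an entity counts as correct only if its start offset,
    end offset, and label all match a gold annotation exactly.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)

    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: (start offset, end offset, label).
gold = [(0, 7, "CHEM"), (12, 18, "CHEM"), (25, 31, "CHEM")]
pred = [(0, 7, "CHEM"), (12, 18, "CHEM"), (40, 45, "CHEM")]

precision, recall, f1 = entity_f1(gold, pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Scores such as 89.7 and 88.0 are this quantity expressed as a percentage; a looser matching rule (for example, partial span overlap) would give different numbers for the same predictions.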

SUBMITTER: Isazawa T 

PROVIDER: S-EPMC9049593 | biostudies-literature | 2022 Mar

REPOSITORIES: biostudies-literature

Publications

Single Model for Organic and Inorganic Chemical Named Entity Recognition in ChemDataExtractor.

Taketomo Isazawa, Jacqueline M. Cole

Journal of Chemical Information and Modeling, 2022 Feb 24, issue 5



Similar Datasets

| S-EPMC3967102 | biostudies-literature
| S-EPMC10272703 | biostudies-literature
| S-EPMC10293979 | biostudies-literature
| S-EPMC4331696 | biostudies-literature
| S-EPMC10495977 | biostudies-literature
| S-EPMC7014657 | biostudies-literature
| S-EPMC9375066 | biostudies-literature
| S-EPMC6956779 | biostudies-literature
| S-EPMC11373323 | biostudies-literature
| S-EPMC3066171 | biostudies-literature