Dataset Information

BioADAPT-MRC: adversarial learning-based domain adaptation improves biomedical machine reading comprehension task.


ABSTRACT:

Motivation

Biomedical machine reading comprehension (biomedical-MRC) aims to comprehend complex biomedical narratives and assist healthcare professionals in retrieving information from them. The high performance of modern neural network-based MRC systems depends on high-quality, large-scale, human-annotated training datasets. In the biomedical domain, a crucial challenge in creating such datasets is the requirement for domain knowledge, which makes labeled data scarce and creates the need for transfer learning from the labeled general-purpose (source) domain to the biomedical (target) domain. However, the marginal distributions of the general-purpose and biomedical domains differ because of differences in topics. Therefore, directly transferring learned representations from a model trained on a general-purpose domain to the biomedical domain can hurt the model's performance.

Results

We present an adversarial learning-based domain adaptation framework for the biomedical machine reading comprehension task (BioADAPT-MRC), a neural network-based method to address the discrepancies in the marginal distributions between the general-purpose and biomedical domain datasets. BioADAPT-MRC relaxes the need for generating pseudo labels for training a well-performing biomedical-MRC model. We extensively evaluate the performance of BioADAPT-MRC by comparing it with the best existing methods on three widely used benchmark biomedical-MRC datasets: BioASQ-7b, BioASQ-8b and BioASQ-9b. Our results suggest that without using any synthetic or human-annotated data from the biomedical domain, BioADAPT-MRC can achieve state-of-the-art performance on these datasets.

Availability and implementation

BioADAPT-MRC is freely available as an open-source project at https://github.com/mmahbub/BioADAPT-MRC.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Mahbub M 

PROVIDER: S-EPMC9477526 | biostudies-literature | 2022 Sep

REPOSITORIES: biostudies-literature

Publications

BioADAPT-MRC: adversarial learning-based domain adaptation improves biomedical machine reading comprehension task.

Maria Mahbub, Sudarshan Srinivasan, Edmon Begoli, Gregory D Peterson

Bioinformatics (Oxford, England), 2022 Sep 1, issue 18


Similar Datasets

S-EPMC6129312 | biostudies-literature
S-EPMC7374333 | biostudies-literature
S-EPMC10132130 | biostudies-literature
S-EPMC10688971 | biostudies-literature
S-EPMC11781752 | biostudies-literature
S-EPMC9648428 | biostudies-literature
S-EPMC8870515 | biostudies-literature
S-EPMC11220966 | biostudies-literature
S-EPMC10982812 | biostudies-literature
S-EPMC10805953 | biostudies-literature