
Dataset Information


Entity and relation extraction from clinical case reports of COVID-19: a natural language processing approach.


ABSTRACT:

Background

Extracting relevant information about infectious diseases is an essential task. However, a significant obstacle in supporting public health research is the lack of methods for effectively mining large amounts of health data.

Objective

This study aims to use natural language processing (NLP) to extract key information (clinical factors and social determinants of health) from case reports published in the literature.

Methods

The proposed framework integrates three layers: a data layer for preparing a data cohort from clinical case reports; an NLP layer to find the clinical and demographic named entities and relations in the texts; and an evaluation layer for benchmarking performance and analysis. The focus of this study is to extract valuable information from COVID-19 case reports.
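The three layers described above can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon, the function names (`data_layer`, `nlp_layer`, `extract_relations`, `evaluation_layer`), and the `has_symptom` relation are all hypothetical, and a dictionary lookup stands in for the trained entity and relation models that the abstract does not detail.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicon; a real NLP layer would use a trained model instead.
LEXICON = {
    "fever": "SYMPTOM",
    "cough": "SYMPTOM",
    "covid-19": "DISEASE",
    "male": "SEX",
    "female": "SEX",
}


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    start: int
    end: int


def data_layer(raw_reports):
    """Data layer: normalise whitespace and drop empty case reports."""
    cleaned = (re.sub(r"\s+", " ", report).strip() for report in raw_reports)
    return [report for report in cleaned if report]


def nlp_layer(report):
    """NLP layer (stand-in): find entities by case-insensitive lexicon lookup."""
    entities = []
    for term, label in LEXICON.items():
        # Lookarounds prevent matching inside longer words ("male" in "female").
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        for match in re.finditer(pattern, report, flags=re.IGNORECASE):
            entities.append(Entity(match.group(0), label, match.start(), match.end()))
    return sorted(entities, key=lambda entity: entity.start)


def extract_relations(entities):
    """Toy relation step: link every disease to every symptom in the same report."""
    diseases = [e for e in entities if e.label == "DISEASE"]
    symptoms = [e for e in entities if e.label == "SYMPTOM"]
    return [(d.text.lower(), "has_symptom", s.text.lower())
            for d in diseases for s in symptoms]


def evaluation_layer(predicted, gold):
    """Evaluation layer: precision, recall and F1 over sets of annotations."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    reports = data_layer(["  Patient with COVID-19 had fever and   cough. ", ""])
    for report in reports:
        found = nlp_layer(report)
        print([(e.text, e.label) for e in found])
        print(extract_relations(found))
```

Keeping the layers as separate functions means the lookup in `nlp_layer` could be swapped for a trained model without touching data preparation or scoring, which is what makes benchmark comparisons in the evaluation layer straightforward.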

Results

The named entity recognition implementation in the NLP layer achieves a performance gain of about 1-3% over benchmark methods. Furthermore, even without extensive data labeling, the relation extraction method outperforms benchmark methods in accuracy by 1-8%. A thorough examination reveals the presence of the disease and the prevalence of symptoms in patients.

Conclusions

A similar approach can be generalized to other infectious diseases. It is worthwhile to use prior knowledge acquired through transfer learning when researching other infectious diseases.

SUBMITTER: Raza S 

PROVIDER: S-EPMC9879259 | biostudies-literature | 2023 Jan

REPOSITORIES: biostudies-literature


Publications

Entity and relation extraction from clinical case reports of COVID-19: a natural language processing approach.

Raza S, Schwartz B

BMC Medical Informatics and Decision Making, 2023-01-26, issue 1



Similar Datasets

| S-EPMC4849652 | biostudies-literature
| S-EPMC10309602 | biostudies-literature
| S-EPMC8132979 | biostudies-literature
| S-EPMC9391313 | biostudies-literature
| S-EPMC3811072 | biostudies-literature
| S-EPMC7392233 | biostudies-literature
| S-EPMC9602175 | biostudies-literature
| S-EPMC8970612 | biostudies-literature
| S-EPMC9779789 | biostudies-literature
| S-EPMC10791738 | biostudies-literature