Dataset Information

Entity and relation extraction from clinical case reports of COVID-19: a natural language processing approach.

ABSTRACT:

Background

Extracting relevant information about infectious diseases is an essential task. However, a significant obstacle in supporting public health research is the lack of methods for effectively mining large amounts of health data.

Objective

This study aims to use natural language processing (NLP) to extract the key information (clinical factors, social determinants of health) from published cases in the literature.

Methods

The proposed framework integrates a data layer for preparing a data cohort from clinical case reports; an NLP layer to find the clinical and demographic named entities and relations in the texts; and an evaluation layer for benchmarking performance and analysis. The focus of this study is to extract valuable information from COVID-19 case reports.
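
The three layers named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify models or code, so the lexicon, the function names (`data_layer`, `nlp_layer`, `evaluation_layer`), the `has_symptom` relation label, and the sample report are all hypothetical. A toy dictionary lookup and same-sentence co-occurrence rule stand in for whatever learned models the study used.

```python
import re

# Hypothetical toy lexicon; a stand-in for a trained NER model.
LEXICON = {
    "fever": "SYMPTOM",
    "cough": "SYMPTOM",
    "dyspnea": "SYMPTOM",
    "covid-19": "DISEASE",
    "male": "SEX",
    "female": "SEX",
}


def data_layer(raw_reports):
    """Data layer: normalise whitespace and drop empty reports."""
    cleaned = (" ".join(r.split()) for r in raw_reports)
    return [r for r in cleaned if r]


def nlp_layer(report):
    """NLP layer: dictionary-based entity lookup, then a same-sentence
    co-occurrence rule linking each DISEASE to each SYMPTOM."""
    entities, relations = set(), set()
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        found = set()
        for token in re.findall(r"[A-Za-z0-9-]+", sentence):
            label = LEXICON.get(token.lower())
            if label:
                found.add((token.lower(), label))
        entities |= found
        diseases = [t for t, lab in found if lab == "DISEASE"]
        symptoms = [t for t, lab in found if lab == "SYMPTOM"]
        relations |= {(d, "has_symptom", s) for d in diseases for s in symptoms}
    return entities, relations


def evaluation_layer(predicted, gold):
    """Evaluation layer: exact-match precision, recall and F1 over sets."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical sample input and gold annotations, for illustration only.
reports = data_layer(["A 45-year-old male with  COVID-19 had fever and cough.", "   "])
entities, relations = nlp_layer(reports[0])
gold = {("male", "SEX"), ("covid-19", "DISEASE"),
        ("fever", "SYMPTOM"), ("dyspnea", "SYMPTOM")}
scores = evaluation_layer(entities, gold)
```

Keeping each layer as a separate function mirrors the separation described in the abstract: the entity and relation step could be swapped for a trained model without touching data preparation or scoring.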

Results

The named entity recognition implementation in the NLP layer achieves a performance gain of about 1-3% compared to benchmark methods. Furthermore, even without extensive data labeling, the relation extraction method outperforms benchmark methods in accuracy by 1-8%. A thorough examination reveals the disease's presence and the prevalence of symptoms in patients.

Conclusions

A similar approach can be generalized to other infectious diseases. It is worthwhile to use prior knowledge acquired through transfer learning when researching other infectious diseases.

SUBMITTER: Raza S

PROVIDER: S-EPMC9879259 | biostudies-literature | 2023 Jan

REPOSITORIES: biostudies-literature

Publications

Entity and relation extraction from clinical case reports of COVID-19: a natural language processing approach.

Raza Shaina, Schwartz Brian

BMC Medical Informatics and Decision Making. 2023 Jan 26; issue 1

PMID: 36703154

Similar Datasets

A Natural Language Processing Tool for Large-Scale Data Extraction from Echocardiography Reports.
| S-EPMC4849652 | biostudies-literature

Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study.
| S-EPMC8132979 | biostudies-literature

Leveraging natural language processing for efficient information extraction from breast cancer pathology reports: Single-institution study.
| S-EPMC12005671 | biostudies-literature

A Methodological Approach to Validate Pneumonia Encounters from Radiology Reports Using Natural Language Processing.
| S-EPMC9391313 | biostudies-literature

Natural language processing for automated triage and prioritization of individual case safety reports for case-by-case assessment
| S-EPMC12443081 | biostudies-literature

Extraction of radiographic findings from unstructured thoracoabdominal computed tomography reports using convolutional neural network based natural language processing.
| S-EPMC7392233 | biostudies-literature

Validation of a Natural Language Processing Algorithm for the Extraction of the Sleep Parameters from the Polysomnography Reports.
| S-EPMC9602175 | biostudies-literature

Using large language models for temporal relation extraction from pediatric clinical reports.
| S-EPMC12640238 | biostudies-literature

Novel approach by natural language processing for COVID-19 knowledge discovery.
| S-EPMC8970612 | biostudies-literature

Automated interpretation of stress echocardiography reports using natural language processing.
| S-EPMC9779789 | biostudies-literature