Dataset Information

Constructing a disease database and using natural language processing to capture and standardize free text clinical information.


ABSTRACT: The ability to extract critical information about an infectious disease in a timely manner is critical for population health research. The lack of procedures for mining large amounts of health data is a major impediment. The goal of this research is to use natural language processing (NLP) to extract key information (clinical factors, social determinants of health) from free text. The proposed framework describes database construction, NLP modules for locating clinical and non-clinical (social determinants) information, and a detailed evaluation protocol for evaluating results and demonstrating the effectiveness of the proposed framework. The use of COVID-19 case reports is demonstrated for data construction and pandemic surveillance. The proposed approach outperforms benchmark methods in F1-score by about 1-3%. A thorough examination reveals the disease's presence as well as the frequency of symptoms in patients. The findings suggest that prior knowledge gained through transfer learning can be useful when researching infectious diseases with similar presentations in order to accurately predict patient outcomes.
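For readers unfamiliar with the metric cited in the abstract, the following is a minimal sketch of how an exact-match F1-score can be computed for extracted entities. It is not the authors' evaluation protocol: the `Entity` tuple layout, the `f1_score` helper, the label names, and the sample spans are all hypothetical and serve only to illustrate the arithmetic.

```python
from typing import Set, Tuple

# Hypothetical entity representation: (start offset, end offset, label).
Entity = Tuple[int, int, str]


def f1_score(gold: Set[Entity], predicted: Set[Entity]) -> float:
    """Exact-match F1: harmonic mean of precision and recall over entity spans."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only; these spans are not taken from the dataset.
gold = {(0, 5, "SYMPTOM"), (10, 18, "DISEASE"), (25, 33, "SDOH")}
predicted = {(0, 5, "SYMPTOM"), (10, 18, "DISEASE"), (40, 45, "SYMPTOM")}

score = f1_score(gold, predicted)
print(f"F1 = {score:.3f}")
```

In this toy case two of the three predicted entities match the gold annotations, so precision and recall are both 2/3. A gain of 1-3% in F1, as the abstract reports, corresponds to a small shift in this balance between missed and spurious extractions.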

SUBMITTER: Raza S 

PROVIDER: S-EPMC10215040 | biostudies-literature | 2023 May

REPOSITORIES: biostudies-literature

Publications

Constructing a disease database and using natural language processing to capture and standardize free text clinical information.

Raza, Shaina; Schwartz, Brian

Scientific Reports, 2023-05-26, issue 1


Similar Datasets

S-EPMC7520927 | biostudies-literature
S-EPMC9577054 | biostudies-literature
S-EPMC7522136 | biostudies-literature
S-EPMC10031450 | biostudies-literature
S-EPMC8132979 | biostudies-literature
S-EPMC6381768 | biostudies-literature
S-EPMC9352066 | biostudies-literature
S-EPMC8635726 | biostudies-literature
S-EPMC10873831 | biostudies-literature