
Dataset Information


Real-time autOmatically updated data warehOuse in healThcare (ROOT): an innovative and automated data collection system.


ABSTRACT:

Background

The American Society of Clinical Oncology recently launched the minimal Common Oncology Data Elements project to facilitate cancer data interoperability. However, clinical data are often not recorded in an organized way, and converting them into a structured format can be time-consuming. A Clinical Data Warehouse (CDW) is a database that consolidates data from different clinical sources. The clinical data extracted from this database include not only structured data but also natural language generated during clinical practice. Applying these data to a clinical study is therefore challenging, because they are unstructured and not formatted in a way that allows essential content to be found. This study determined how best to organize a large volume of clinical data to evaluate the clinical features and outcomes of upper aerodigestive tract cancers, including cancers of the head and neck, esophagus, lung, and thymus, and mesothelioma.

Methods

The Real-time autOmatically updated data warehOuse in healThcare (ROOT) uses six main regions to describe the journey of cancer patients. This study developed an algorithm optimized for each disease category, using natural language processing for unstructured data and data capture for structured data. Data from patients diagnosed at the Samsung Medical Center from 2008 to 2020 were used.
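
The abstract does not describe the extraction rules themselves. As a minimal sketch of what rule-based natural language processing of a free-text note might look like, the Python example below pulls an EGFR status out of a sentence with regular expressions. The patterns, the function name `extract_egfr_status`, and the returned field names are all assumptions made for illustration; they are not the ROOT algorithm.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only; the actual ROOT rules
# are not given in the abstract. Matching stops at ".", ",", ";" or a
# newline so that a finding for another gene in the same sentence is
# not attributed to EGFR.
_EGFR_NEGATIVE = re.compile(
    r"\bEGFR\b[^.,;\n]*?\b(wild[\s-]?type|negative|not detected)\b",
    re.IGNORECASE,
)
_EGFR_POSITIVE = re.compile(
    r"\bEGFR\b[^.,;\n]*?\b(exon\s*19\s*del(?:etion)?|L858R|T790M)\b",
    re.IGNORECASE,
)


def extract_egfr_status(note: str) -> Optional[dict]:
    """Return a structured EGFR finding from a free-text note, or None.

    Negation is checked first so that a phrase such as
    "EGFR L858R not detected" is not reported as positive.
    """
    if _EGFR_NEGATIVE.search(note):
        return {"gene": "EGFR", "status": "negative", "variant": None}
    positive = _EGFR_POSITIVE.search(note)
    if positive:
        return {
            "gene": "EGFR",
            "status": "positive",
            "variant": positive.group(1),
        }
    return None  # no EGFR statement recognized in this note


if __name__ == "__main__":
    print(extract_egfr_status("EGFR mutation: exon 19 deletion detected, ALK negative."))
    print(extract_egfr_status("EGFR: wild type"))
    print(extract_egfr_status("No molecular testing performed."))
```

A real pipeline would need far broader vocabulary, handling of multiple statements per note, and validation against manually curated records; the sketch only shows the general shape of turning narrative text into a structured field.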

Results

Comprehensive clinical data were collected for 67,617 patients across six tumor types: 28,954 with non-small-cell lung cancer, 2,540 with small-cell lung cancer, 30,035 with head and neck cancer, 4,950 with esophageal cancer, 966 with thymic cancer, and 172 with mesothelioma. The results of longitudinal molecular studies, including epidermal growth factor receptor (EGFR) mutation tests, anaplastic lymphoma kinase (ALK) tests, and next-generation sequencing (NGS), were also included. Scattered information was integrated and automatically assembled to match the cohort, allowing users to capture the most recent test results and treatment outcomes.
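
One way to picture "capturing the most recent test results" from longitudinal records is to keep, for each patient and test, only the record with the latest report date. The Python sketch below does this for a handful of made-up records. The `MolecularResult` record, its fields, and the `latest_results` helper are hypothetical and are not taken from the ROOT system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class MolecularResult:
    """Hypothetical structured record of one molecular test report."""
    patient_id: str
    test: str        # e.g. "EGFR", "ALK", "NGS"
    result: str
    reported: date


def latest_results(
    records: Iterable[MolecularResult],
) -> Dict[Tuple[str, str], MolecularResult]:
    """Keep the most recently reported result per (patient, test) pair."""
    latest: Dict[Tuple[str, str], MolecularResult] = {}
    for record in records:
        key = (record.patient_id, record.test)
        current = latest.get(key)
        if current is None or record.reported > current.reported:
            latest[key] = record
    return latest


if __name__ == "__main__":
    # Made-up example records, not study data.
    records = [
        MolecularResult("P001", "EGFR", "exon 19 deletion", date(2015, 3, 2)),
        MolecularResult("P001", "EGFR", "T790M", date(2017, 8, 14)),
        MolecularResult("P001", "ALK", "negative", date(2015, 3, 2)),
    ]
    for (patient_id, test), record in sorted(latest_results(records).items()):
        print(patient_id, test, record.result, record.reported.isoformat())
```

Re-running such a selection whenever new reports arrive would keep each patient's summary current, which is the behavior the abstract describes at a high level.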

Conclusions

This landmark study documented the successful construction of a real-time updating system for medical big data, based on the CDW program.

SUBMITTER: Jung HA 

PROVIDER: S-EPMC8577969 | biostudies-literature | 2021 Oct

REPOSITORIES: biostudies-literature


Publications

Real-time autOmatically updated data warehOuse in healThcare (ROOT): an innovative and automated data collection system.

Jung Hyun Ae, Jeong Oksoon, Chang Dong Kyung, Park Sehhoon, Sun Jong-Mu, Lee Se-Hoon, Ahn Jin Seok, Ahn Myung-Ju, Park Keunchil

Translational Lung Cancer Research. 2021 Oct 1;10



Similar Datasets

| S-EPMC5469019 | biostudies-literature
| S-EPMC8238471 | biostudies-literature
| S-EPMC6054708 | biostudies-literature
| S-EPMC7303827 | biostudies-literature
| S-EPMC3851580 | biostudies-literature
| S-EPMC5836265 | biostudies-literature
| S-EPMC6436678 | biostudies-literature
| S-EPMC3022815 | biostudies-literature
| S-EPMC6490913 | biostudies-literature
| S-EPMC11533334 | biostudies-literature