Dataset Information

A statistical quality assessment method for longitudinal observations in electronic health record data with an application to the VA million veteran program.


ABSTRACT:

Background

To describe an automated method for assessing the plausibility of continuous variables recorded in electronic health record (EHR) data for use in real-world evidence research.

Methods

The most widely used approach to quality assessment (QA) of continuous variables is to detect implausible values using prespecified thresholds. To augment the thresholding method, we developed a score-based method that leverages the longitudinal characteristics of EHR data to detect observations that are inconsistent with a patient's history. The method was applied to height and weight data in the EHR from the Million Veteran Program data of the Veterans Health Administration (VHA). A validation study was also conducted.
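The contrast between the two ideas can be illustrated with a minimal sketch. It is not the authors' scoring method, which the abstract does not specify. The plausibility limits, the leave-one-out median/MAD score, the 1 kg scale floor, and the names `threshold_flags` and `history_scores` are all assumptions chosen for illustration.

```python
from statistics import median

# Hypothetical plausibility limits for adult weight in kg (assumed, not from the paper).
WEIGHT_LIMITS_KG = (20.0, 350.0)


def threshold_flags(values, limits=WEIGHT_LIMITS_KG):
    """Thresholding: flag a value only if it falls outside fixed limits."""
    lo, hi = limits
    return [not (lo <= v <= hi) for v in values]


def history_scores(values, min_scale=1.0):
    """Illustrative longitudinal score (an assumption, not the published one).

    Each observation is compared with the median of the patient's OTHER
    observations, scaled by their median absolute deviation (MAD).
    Larger scores mean the value is less consistent with the history.
    """
    scores = []
    for i, v in enumerate(values):
        others = values[:i] + values[i + 1:]
        if len(others) < 2:
            scores.append(0.0)  # too little history to judge
            continue
        center = median(others)
        mad = median([abs(x - center) for x in others])
        # 1.4826 * MAD estimates the standard deviation under normality;
        # the floor avoids dividing by a near-zero spread.
        scale = max(1.4826 * mad, min_scale)
        scores.append(abs(v - center) / scale)
    return scores


# One patient's weight history in kg; the fourth entry looks like a data-entry error.
weights = [82.0, 83.5, 81.0, 38.0, 84.0, 82.5]
print(threshold_flags(weights))
print([round(s, 2) for s in history_scores(weights)])
```

In this toy history, the value 38.0 lies inside the assumed limits, so thresholding alone does not flag it, while its history-based score is far larger than the scores of the other entries. A continuous score also lets an investigator choose a cut-off that matches the data quality a given analysis needs.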

Results

The developed method outperforms the widely used thresholding method on receiver operating characteristic (ROC) metrics. The choice of quality assessment method is also shown to have a non-ignorable impact on the body mass index (BMI) classification calculated from height and weight data in the VHA's database.
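A small sketch shows why an undetected error in height or weight can change a BMI class. It assumes the standard adult BMI formula (weight in kg divided by height in metres squared) and the commonly used WHO cut-offs of 18.5, 25, and 30; the abstract does not state which categories the study used, and the patient values below are invented for illustration.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def bmi_class(value):
    """Classify BMI with the commonly used WHO adult cut-offs (assumed here)."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal weight"
    if value < 30.0:
        return "overweight"
    return "obese"


# Hypothetical patient: true height 1.78 m, mistyped once as 1.87 m.
correct = bmi(80.0, 1.78)
mistyped = bmi(80.0, 1.87)
print(round(correct, 1), bmi_class(correct))
print(round(mistyped, 1), bmi_class(mistyped))
```

Both heights are plausible for an adult, so a fixed threshold would accept either, yet the transposed digits move this hypothetical patient across the cut-off at 25.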

Conclusions

The score-based method enables automated, scalable detection of problematic data points in health care big data while allowing investigators to select high-quality data according to their needs. Leveraging the longitudinal characteristics of EHR data will significantly improve QA performance.

SUBMITTER: Wang H 

PROVIDER: S-EPMC8529838 | biostudies-literature | 2021 Oct

REPOSITORIES: biostudies-literature

Publications

A statistical quality assessment method for longitudinal observations in electronic health record data with an application to the VA million veteran program.

Wang H, Belitskaya-Levy I, Wu F, Lee JS, Shih MC, Tsao PS, Lu Y

BMC Medical Informatics and Decision Making, 2021 Oct 20, issue 1


Similar Datasets

S-EPMC8636485 | biostudies-literature
S-EPMC10327290 | biostudies-literature
S-EPMC10732342 | biostudies-literature
S-EPMC6710266 | biostudies-literature
S-EPMC9132265 | biostudies-literature
S-EPMC10239111 | biostudies-literature
S-EPMC6568141 | biostudies-literature
S-EPMC9583190 | biostudies-literature
phs001672 | dbGaP
S-EPMC8864634 | biostudies-literature