Dataset Information

Comparison of machine learning methods with logistic regression analysis in creating predictive models for risk of critical in-hospital events in COVID-19 patients on hospital admission.


ABSTRACT:

Background

Machine learning (ML) algorithms have been trained for early prediction of critical in-hospital events from COVID-19 using patient data at admission, but little is known about how their performance compares with each other and/or with statistical logistic regression (LR). This prospective multicentre cohort study compares the performance of an LR model and five ML models, and examines how influential predictors and predictor-to-event relationships contribute to prediction model performance.

Methods

We used 25 baseline variables of 490 COVID-19 patients admitted to 8 hospitals in Germany (March-November 2020) to develop and validate (75/25 random split) 3 linear (L1 penalty, L2 penalty, elastic net [EN]) and 2 non-linear (support vector machine [SVM] with radial kernel, random forest [RF]) ML approaches for predicting critical events, defined as intensive care unit transfer, invasive ventilation and/or death (composite end-point: 181 patients). Models were compared for performance (area under the receiver operating characteristic curve [AUC], Brier score) and predictor importance (performance-loss metrics, partial-dependence profiles).
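The two performance measures and the random split named above can be illustrated with a minimal sketch in plain Python. This is not the study's code: the function names (`brier_score`, `roc_auc`, `random_split`) are hypothetical helpers, and the labels and probabilities are invented toy values, not study data. The AUC is computed here through its pairwise-ranking interpretation, which is only one of several equivalent ways to obtain it.

```python
import random


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the share of (event, non-event) pairs in which the event case
    receives the higher predicted probability; ties count as half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def random_split(n_rows, train_fraction=0.75, seed=0):
    """Shuffle row indices and cut them into development and validation parts."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    cut = round(n_rows * train_fraction)
    return idx[:cut], idx[cut:]


# Toy values for illustration only (assumed, NOT taken from the study).
y_true = [1, 0, 1, 0, 0, 1]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.7, 0.8]

train_idx, test_idx = random_split(490)  # 490 = cohort size stated in Methods
print("AUC:", roc_auc(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
print("split sizes:", len(train_idx), len(test_idx))
```

A higher AUC indicates better discrimination, while a lower Brier score indicates predicted probabilities that lie closer to the observed outcomes.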

Results

Models performed similarly, with a small benefit for LR (utilizing restricted cubic splines for non-linearity) and RF (mean AUC: 0.763-0.731 [RF-L1]; Brier scores: 0.184-0.197 [LR-L1]). The top-ranked predictor variables (consistently highest importance: C-reactive protein) were largely identical across models, except for creatinine, which exhibited marginal (L1, L2, EN, SVM) or high/non-linear effects (LR, RF) on events.

Conclusions

Although the LR and ML models analysed showed no strong differences in performance or in the most influential predictors for COVID-19-related event prediction, our results indicate a predictive benefit from accounting for non-linear predictor-to-event relationships and effects. Future efforts should focus on moving data-driven ML technologies from static towards dynamic modelling solutions that continuously learn and adapt to changes in data environments during the evolving pandemic.

Trial registration number

NCT04659187.

SUBMITTER: Sievering AW 

PROVIDER: S-EPMC9702742 | biostudies-literature | 2022 Nov

REPOSITORIES: biostudies-literature

Publications

Comparison of machine learning methods with logistic regression analysis in creating predictive models for risk of critical in-hospital events in COVID-19 patients on hospital admission.

Sievering AW, Wohlmuth P, Geßler N, Gunawardene MA, Herrlinger K, Bein B, Arnold D, Bergmann M, Nowak L, Gloeckner C, Koch I, Bachmann M, Herborn CU, Stang A

BMC Medical Informatics and Decision Making, 2022-11-28, issue 1


Similar Datasets

| S-EPMC7037603 | biostudies-literature
| S-EPMC6786577 | biostudies-literature
| S-EPMC9804041 | biostudies-literature
| S-EPMC4672891 | biostudies-literature
| S-EPMC6094446 | biostudies-literature
| S-EPMC9596815 | biostudies-literature
| S-EPMC10648137 | biostudies-literature
| S-EPMC10165217 | biostudies-literature