Dataset Information

Statistical quantification of confounding bias in machine learning models.


ABSTRACT:

Background

The lack of nonparametric statistical tests for confounding bias significantly hampers the development of robust, valid, and generalizable predictive models in many fields of research. Here I propose the partial confounder test, which, for a given confounder variable, probes the null hypothesis that the model is unconfounded.

Results

The test provides strict control of type I errors and high statistical power, even for nonnormally and nonlinearly dependent predictions, which are often seen in machine learning. Applying the proposed test to models trained on large-scale functional brain connectivity data (N = 1,865) (i) reveals previously unreported confounders and (ii) shows that state-of-the-art confound mitigation approaches may fail to prevent confounder bias in several cases.

Conclusions

The proposed test (implemented in the package mlconfound; https://mlconfound.readthedocs.io) can aid the assessment and improvement of the generalizability and validity of predictive models and thereby foster the development of clinically useful machine learning biomarkers.
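
To make the tested null hypothesis concrete (predictions are independent of the confounder once the true target is accounted for), here is a minimal sketch in plain Python. It is not the partial confounder test as implemented in mlconfound. It swaps in a much simpler linear stand-in: a residual-permutation test of the partial correlation between the predictions and the confounder given the target. Unlike the proposed test, it assumes linear dependence. The function names, the simulated data, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def _residuals(x, z):
    """Residuals of x after a simple least-squares regression on z."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    slope = sxz / szz
    return [(xi - mx) - slope * (zi - mz) for xi, zi in zip(x, z)]


def _abs_corr(a, b):
    """Absolute Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return abs(sab) / (saa * sbb) ** 0.5


def linear_partial_confound_sketch(y, yhat, c, num_perms=500, seed=0):
    """Hypothetical linear stand-in, NOT the mlconfound implementation.

    H0: yhat is (linearly) unrelated to c after removing the effect of y.
    Returns the observed statistic and a permutation p-value.
    """
    rng = random.Random(seed)
    r_yhat = _residuals(yhat, y)  # part of the predictions not explained by y
    r_c = _residuals(c, y)        # part of the confounder not explained by y
    observed = _abs_corr(r_yhat, r_c)
    shuffled = list(r_c)
    exceed = 0
    for _ in range(num_perms):
        rng.shuffle(shuffled)     # break any link between the two residuals
        if _abs_corr(r_yhat, shuffled) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + num_perms)


# Simulated example (assumed data, for illustration only).
rng = random.Random(42)
n = 200
y = [rng.gauss(0, 1) for _ in range(n)]                # true target
c = [0.5 * yi + rng.gauss(0, 1) for yi in y]           # confounder, related to y
yhat_confounded = [yi + 0.8 * ci + rng.gauss(0, 0.5)   # predictions that leak c
                   for yi, ci in zip(y, c)]
yhat_clean = [yi + rng.gauss(0, 0.5) for yi in y]      # predictions driven by y only

stat_conf, p_conf = linear_partial_confound_sketch(y, yhat_confounded, c)
stat_clean, p_clean = linear_partial_confound_sketch(y, yhat_clean, c)
print("confounded model:   stat=%.3f p=%.4f" % (stat_conf, p_conf))
print("unconfounded model: stat=%.3f p=%.4f" % (stat_clean, p_clean))
```

A small p-value for the first model would suggest that its predictions carry confounder information beyond what the target explains. For real analyses, and for the nonnormal and nonlinear dependence the abstract refers to, the mlconfound package should be used instead of this sketch.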

SUBMITTER: Spisak T 

PROVIDER: S-EPMC9412867 | biostudies-literature | 2022 Aug

REPOSITORIES: biostudies-literature

Publications

Statistical quantification of confounding bias in machine learning models.

Spisak Tamas T  

GigaScience, 2022 Aug 1



Similar Datasets

| S-EPMC7236480 | biostudies-literature
| S-EPMC11470166 | biostudies-literature
| S-EPMC9951458 | biostudies-literature
| S-EPMC9365193 | biostudies-literature
| S-EPMC10952221 | biostudies-literature
| S-EPMC11306200 | biostudies-literature
| S-EPMC10539075 | biostudies-literature
| S-EPMC7835549 | biostudies-literature
| S-EPMC7487379 | biostudies-literature
| S-EPMC7966799 | biostudies-literature