
Dataset Information


Assessing machine learning for fair prediction of ADHD in school pupils using a retrospective cohort study of linked education and healthcare data.


ABSTRACT:

Objectives

Attention deficit hyperactivity disorder (ADHD) is a prevalent childhood disorder, but often goes unrecognised and untreated. To improve access to services, accurate predictions of populations at high risk of ADHD are needed for effective resource allocation. Using a unique linked health and education data resource, we examined how machine learning (ML) approaches can predict risk of ADHD.

Design

Retrospective population cohort study.

Setting

South London (2007-2013).

Participants

n=56 258 pupils with linked education and health data.

Primary outcome measures

Using area under the curve (AUC), we compared the predictive accuracy of four ML models and one neural network for ADHD diagnosis. Ethnic group and language biases were weighted using a fair pre-processing algorithm.
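As a reminder of what the AUC measures, it can be read as the probability that a randomly chosen pupil with an ADHD diagnosis receives a higher predicted risk than a randomly chosen pupil without one. The sketch below computes it directly from that definition. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = diagnosed, 0 = not diagnosed.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise loop is quadratic in sample size, so for a cohort of this scale a rank-based implementation from an established library would be the practical choice. The version here is only meant to make the definition concrete.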

Results

Random forest and logistic regression prediction models provided the highest predictive accuracy for ADHD in population samples (AUC 0.86 and 0.86, respectively) and clinical samples (AUC 0.72 and 0.70). Precision-recall curve analyses were less favourable. Sociodemographic biases were effectively reduced by a fair pre-processing algorithm without loss of accuracy.
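The abstract does not say why the precision-recall analyses were less favourable. One common reason for such a gap is class imbalance: when the outcome is rare, precision falls even if sensitivity and specificity stay fixed. The sketch below illustrates this with Bayes' rule. The sensitivity, specificity and prevalence values are hypothetical and are not figures from the study.

```python
def precision(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same hypothetical classifier, two hypothetical prevalence levels.
balanced = precision(0.8, 0.8, 0.5)   # outcome present in half the sample
rare = precision(0.8, 0.8, 0.02)      # outcome present in 2% of the sample
print(balanced, rare)
```

With identical sensitivity and specificity, the rare-outcome case yields far lower precision, which is why ROC-based and precision-recall-based summaries of the same model can diverge.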

Conclusions

ML approaches using linked routinely collected education and health data offer accurate, low-cost and scalable prediction models of ADHD. These approaches could help identify areas of need and inform resource allocation. Introducing 'fairness weighting' attenuates some sociodemographic biases which would otherwise underestimate ADHD risk within minority groups.
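The abstract does not name the fair pre-processing algorithm that was used. As an illustration only, the sketch below assumes one widely used scheme, reweighing, in which each record receives the weight P(group) x P(label) / P(group, label), so that group membership and outcome become statistically independent in the weighted training data. The function name `reweighing_weights` and the toy groups and labels are hypothetical.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Per-record weights w = P(group) * P(label) / P(group, label).

    Assumption: this is the reweighing scheme, used here as a stand-in;
    the source does not specify which pre-processing algorithm was applied.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_positive_rate(groups, labels, weights, group):
    """Weighted share of positive labels within one group."""
    total = sum(w for g, w in zip(groups, weights) if g == group)
    positive = sum(
        w for g, y, w in zip(groups, labels, weights) if g == group and y == 1
    )
    return positive / total


# Hypothetical toy data: group "a" is diagnosed more often than group "b".
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)
rate_a = weighted_positive_rate(groups, labels, weights, "a")
rate_b = weighted_positive_rate(groups, labels, weights, "b")
print(rate_a, rate_b)
```

In this toy example the unweighted positive rates differ between the two groups, while the weighted rates coincide. Weights of this kind would typically be passed to a model as per-sample training weights.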

SUBMITTER: Ter-Minassian L 

PROVIDER: S-EPMC9723859 | biostudies-literature | 2022 Dec

REPOSITORIES: biostudies-literature


Publications

Assessing machine learning for fair prediction of ADHD in school pupils using a retrospective cohort study of linked education and healthcare data.

Lucile Ter-Minassian, Natalia Viani, Alice Wickersham, Lauren Cross, Robert Stewart, Sumithra Velupillai, Johnny Downs

BMJ Open, 5 December 2022; 12



Similar Datasets

| S-EPMC4166140 | biostudies-literature
| S-EPMC10622966 | biostudies-literature
| S-EPMC9991779 | biostudies-literature
| S-EPMC10294679 | biostudies-literature
| S-EPMC9536596 | biostudies-literature
| S-EPMC4856601 | biostudies-literature
| S-EPMC7064332 | biostudies-literature
| S-EPMC9319405 | biostudies-literature
| S-EPMC11320197 | biostudies-literature
| S-EPMC5362261 | biostudies-literature