
Dataset Information


Accurate auto-labeling of chest X-ray images based on quantitative similarity to an explainable AI model.


ABSTRACT: The inability to accurately and efficiently label large, open-access medical imaging datasets limits the widespread implementation of artificial intelligence models in healthcare. There have been few attempts, however, to automate the annotation of such public databases; one approach, for example, focused on labor-intensive, manual labeling of subsets of these datasets to be used to train new models. In this study, we describe a method for standardized, automated labeling based on similarity to a previously validated, explainable AI (xAI) model-derived atlas, for which the user can specify a quantitative threshold for a desired level of accuracy (the probability-of-similarity, or pSim, metric). We show that our xAI model, by calculating the pSim values for each clinical output label based on comparison to its training-set-derived reference atlas, can automatically label external datasets to a user-selected, high level of accuracy, equaling or exceeding that of human experts. We additionally show that, by fine-tuning the original model using the automatically labeled exams for retraining, performance can be preserved or improved, resulting in a highly accurate, more generalized model.
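
The thresholding step described in the abstract can be illustrated with a minimal sketch. It assumes the pSim value for each clinical output label has already been computed and lies in [0, 1]; how pSim is derived from the reference atlas is not shown here. The function name `auto_label`, the example label names, and the threshold value are hypothetical and are not taken from the study.

```python
from typing import Dict, Optional


def auto_label(
    psim_by_label: Dict[str, float], threshold: float = 0.9
) -> Dict[str, Optional[int]]:
    """Assign a label only where pSim meets the user-selected threshold.

    Assumption: psim_by_label maps each clinical output label to a
    precomputed pSim value in [0, 1]. Labels below the threshold are
    left as None (unlabeled) rather than being forced to a value.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0, 1]")
    labels: Dict[str, Optional[int]] = {}
    for name, psim in psim_by_label.items():
        labels[name] = 1 if psim >= threshold else None
    return labels


# Hypothetical pSim values for one exam (illustrative only).
example = {"cardiomegaly": 0.97, "pleural_effusion": 0.42}
result = auto_label(example, threshold=0.9)
```

Raising the threshold in a sketch like this trades coverage for accuracy: fewer exams receive a label, but those that do are more similar to the reference atlas.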

SUBMITTER: Kim D 

PROVIDER: S-EPMC8986787 | biostudies-literature | 2022 Apr

REPOSITORIES: biostudies-literature


Publications

Accurate auto-labeling of chest X-ray images based on quantitative similarity to an explainable AI model.

Doyun Kim, Joowon Chung, Jongmun Choi, Marc D Succi, John Conklin, Maria Gabriela Figueiro Longo, Jeanne B Ackman, Brent P Little, Milena Petranovic, Mannudeep K Kalra, Michael H Lev, Synho Do

Nature Communications, 2022 Apr 6, issue 1



Similar Datasets

| S-EPMC9497601 | biostudies-literature
| S-EPMC10942047 | biostudies-literature
| S-EPMC10987553 | biostudies-literature
| S-EPMC7435602 | biostudies-literature
| S-EPMC9937995 | biostudies-literature
| S-EPMC11681242 | biostudies-literature
| S-EPMC9222116 | biostudies-literature
| S-EPMC8966154 | biostudies-literature
| S-EPMC9687210 | biostudies-literature
| S-EPMC9715345 | biostudies-literature