Dataset Information

Automated Explainable Multidimensional Deep Learning Platform of Retinal Images for Retinopathy of Prematurity Screening.

ABSTRACT:

Importance

A retinopathy of prematurity (ROP) diagnosis currently relies on indirect ophthalmoscopy assessed by experienced ophthalmologists. A deep learning algorithm based on retinal images may facilitate early detection and timely treatment of ROP to improve visual outcomes.

Objective

To develop a retinal image-based, multidimensional, automated, deep learning platform for ROP screening and validate its performance accuracy.

Design, setting, and participants

A total of 14 108 eyes of 8652 preterm infants who received ROP screening at 4 centers from November 4, 2010, to November 14, 2019, were included, and a total of 52 249 retinal images were randomly split into training, validation, and test sets. Four independent classifiers, one per main dimension, were developed: image quality, any stage of ROP, intraocular hemorrhage, and preplus/plus disease. Referral-warranted ROP was automatically determined by integrating the results of the 4 classifiers at the image, eye, and patient levels. DeepSHAP, a method based on DeepLIFT and Shapley values (a solution concept in cooperative game theory), was adopted as the heat map technology to explain the predictions. The platform's performance was further validated by comparison with that of experienced ROP experts. Data were analyzed from February 12, 2020, to June 24, 2020.
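The abstract does not state how the 4 classifier outputs are combined, so the following is only a minimal sketch of one plausible image-to-eye-to-patient roll-up. The `ImageResult` type, its field names, and the "any positive finding on a gradable image" rule are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageResult:
    """Hypothetical per-image outputs of the 4 classifiers."""
    patient_id: str
    eye: str                 # e.g. "OD" or "OS"
    gradable: bool           # image-quality classifier
    any_rop: bool            # any stage of ROP
    hemorrhage: bool         # intraocular hemorrhage
    preplus_or_plus: bool    # preplus/plus disease


def image_flag(r: ImageResult) -> bool:
    # Assumed rule: a gradable image is flagged when any lesion
    # classifier is positive; ungradable images contribute no flag.
    return r.gradable and (r.any_rop or r.hemorrhage or r.preplus_or_plus)


def eye_flags(results):
    # Assumed rule: an eye is flagged if any of its images is flagged.
    flags = defaultdict(bool)
    for r in results:
        flags[(r.patient_id, r.eye)] |= image_flag(r)
    return dict(flags)


def patient_flags(results):
    # Assumed rule: a patient is flagged if either eye is flagged.
    flags = defaultdict(bool)
    for (patient_id, _eye), flagged in eye_flags(results).items():
        flags[patient_id] |= flagged
    return dict(flags)
```

Under these assumed rules a single positive image is enough to flag the eye and the patient, which favors sensitivity over specificity. A real screening system may well weight the classifiers differently or treat ungradable images as a reason for manual review.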

Exposure

A deep learning algorithm.

Main outcomes and measures

Performance measures for each classifier included true-negative, false-positive, false-negative, and true-positive counts, F1 score, sensitivity, specificity, receiver operating characteristic curve, area under the curve (AUC), and Cohen unweighted κ.
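Most of these measures follow directly from the four confusion-matrix counts. The sketch below uses the standard textbook definitions for a binary classifier; the function name `binary_metrics` and the example counts are illustrative only and are not taken from the study. AUC is omitted because it requires the classifier's continuous scores, not just the counts.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics from confusion counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0)
    and that chance agreement is below 1, so no division by zero.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)          # true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall

    # Cohen unweighted kappa: observed agreement corrected for the
    # agreement expected by chance from the marginal totals.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


# Illustrative counts only: 100 positives and 900 negatives.
metrics = binary_metrics(tp=90, fp=50, fn=10, tn=850)
```

With these made-up counts, sensitivity is 0.90 and specificity is about 0.94, yet F1 is only 0.75, because the false positives are numerous relative to the small positive class. The same effect may explain why an F1 score can sit well below sensitivity and specificity when the target finding is uncommon.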

Results

A total of 14 108 eyes of 8652 preterm infants (mean [SD] gestational age, 32.9 [3.1] weeks; 4818 boys [60.4%] of 7973 with known sex) received ROP screening. The performance of all classifiers achieved an F1 score of 0.718 to 0.981, a sensitivity of 0.918 to 0.982, a specificity of 0.949 to 0.992, and an AUC of 0.983 to 0.998, whereas that of the referral system achieved an F1 score of 0.898 to 0.956, a sensitivity of 0.981 to 0.986, a specificity of 0.939 to 0.974, and an AUC of 0.9901 to 0.9956. Fine-grained and class-discriminative heat maps were generated by DeepSHAP in real time. The platform achieved a Cohen unweighted κ of 0.86 to 0.98 compared with a Cohen κ of 0.93 to 0.98 by the ROP experts.

Conclusions and relevance

In this diagnostic study, an automated ROP screening platform was able to identify and classify multidimensional pathologic lesions in the retinal images. This platform may be able to assist routine ROP screening in general and children's hospitals.

SUBMITTER: Wang J

PROVIDER: S-EPMC8100867 | biostudies-literature | 2021 May

REPOSITORIES: biostudies-literature


Publications

Automated Explainable Multidimensional Deep Learning Platform of Retinal Images for Retinopathy of Prematurity Screening.

Wang Ji, Ji Jie, Zhang Mingzhi, Lin Jian-Wei, Zhang Guihua, Gong Weifen, Cen Ling-Ping, Lu Yamei, Huang Xuelin, Huang Dingguo, Li Taiping, Ng Tsz Kin, Pang Chi Pui

JAMA Network Open, 2021 May 3, issue 5


PMID: 33950206

Similar Datasets

Project description:
Objective: To develop and validate a real-world screening, guideline-based deep learning (DL) system for referable diabetic retinopathy (DR) detection.
Design: This is a multicentre platform development study based on retrospective, cross-sectional data sets. Images were labelled by two-level certificated graders as the ground truth. According to the UK DR screening guideline, a DL model based on colour retinal images with five-dimensional classifiers, namely image quality, retinopathy, maculopathy gradability, maculopathy and photocoagulation, was developed. Referable decisions were generated by integrating the output of all classifiers and reported at the image, eye and patient level. The performance of the DL was compared with DR experts.
Setting: DR screening programmes from three hospitals and the Lifeline Express Diabetic Retinopathy Screening Program in China.
Participants: 83 465 images of 39 836 eyes from 21 716 patients were annotated, of which 53 211 images were used as the development set and 30 254 images were used as the external validation set, split based on centre and period.
Main outcomes: Accuracy, F1 score, sensitivity, specificity, area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), Cohen's unweighted κ and Gwet's AC1 were calculated to evaluate the performance of the DL algorithm.
Results: In the external validation set, the five classifiers achieved an accuracy of 0.915-0.980, F1 score of 0.682-0.966, sensitivity of 0.917-0.978, specificity of 0.907-0.981, AUROC of 0.9639-0.9944 and AUPRC of 0.7504-0.9949. Referable DR at three levels was detected with an accuracy of 0.918-0.967, F1 score of 0.822-0.918, sensitivity of 0.970-0.971, specificity of 0.905-0.967, AUROC of 0.9848-0.9931 and AUPRC of 0.9527-0.9760. With reference to the ground truth, the DL system showed comparable performance (Cohen's κ: 0.86-0.93; Gwet's AC1: 0.89-0.94) with three DR experts (Cohen's κ: 0.89-0.96; Gwet's AC1: 0.91-0.97) in detecting referable lesions.
Conclusions: The automatic DL system for detection of referable DR based on the UK guideline could achieve high accuracy in multidimensional classifications. It is suitable for large-scale, real-world DR screening.

Project description:
Importance: Retinopathy of prematurity (ROP) is a leading cause of childhood blindness worldwide. The decision to treat is primarily based on the presence of plus disease, defined as dilation and tortuosity of retinal vessels. However, clinical diagnosis of plus disease is highly subjective and variable.
Objective: To implement and validate an algorithm based on deep learning to automatically diagnose plus disease from retinal photographs.
Design, setting, and participants: A deep convolutional neural network was trained using a data set of 5511 retinal photographs. Each image was previously assigned a reference standard diagnosis (RSD) based on consensus of image grading by 3 experts and clinical diagnosis by 1 expert (ie, normal, pre-plus disease, or plus disease). The algorithm was evaluated by 5-fold cross-validation and tested on an independent set of 100 images. Images were collected from 8 academic institutions participating in the Imaging and Informatics in ROP (i-ROP) cohort study. The deep learning algorithm was tested against 8 ROP experts, each of whom had more than 10 years of clinical experience and more than 5 peer-reviewed publications about ROP. Data were collected from July 2011 to December 2016. Data were analyzed from December 2016 to September 2017.
Exposures: A deep learning algorithm trained on retinal photographs.
Main outcomes and measures: Receiver operating characteristic analysis was performed to evaluate performance of the algorithm against the RSD. Quadratic-weighted κ coefficients were calculated for ternary classification (ie, normal, pre-plus disease, and plus disease) to measure agreement with the RSD and 8 independent experts.
Results: Of the 5511 included retinal photographs, 4535 (82.3%) were graded as normal, 805 (14.6%) as pre-plus disease, and 172 (3.1%) as plus disease, based on the RSD. Mean (SD) area under the receiver operating characteristic curve statistics were 0.94 (0.01) for the diagnosis of normal (vs pre-plus disease or plus disease) and 0.98 (0.01) for the diagnosis of plus disease (vs normal or pre-plus disease). For diagnosis of plus disease in an independent test set of 100 retinal images, the algorithm achieved a sensitivity of 93% with 94% specificity. For detection of pre-plus disease or worse, the sensitivity and specificity were 100% and 94%, respectively. On the same test set, the algorithm achieved a quadratic-weighted κ coefficient of 0.92 compared with the RSD, outperforming 6 of 8 ROP experts.
Conclusions and relevance: This fully automated algorithm diagnosed plus disease in ROP with comparable or better accuracy than human experts. This has potential applications in disease detection, monitoring, and prognosis in infants at risk of ROP.

Project description:
Importance: Machine learning (ML) algorithms have the potential to identify eyes with early diabetic retinopathy (DR) at increased risk for disease progression.
Objective: To create and validate automated ML models (autoML) for DR progression from ultra-widefield (UWF) retinal images.
Design, setting and participants: Deidentified UWF images with mild or moderate nonproliferative DR (NPDR) with 3 years of longitudinal follow-up retinal imaging or evidence of progression within 3 years were used to develop automated ML models for predicting DR progression in UWF images. All images were collected from a tertiary diabetes-specific medical center retinal image dataset. Data were collected from July to September 2022.
Exposure: Automated ML models were generated from baseline on-axis 200° UWF retinal images. Baseline retinal images were labeled for progression based on centralized reading center evaluation of baseline and follow-up images according to the clinical Early Treatment Diabetic Retinopathy Study severity scale. Images for model development were split 8-1-1 for training, optimization, and testing to detect 1 or more steps of DR progression. Validation was performed using a 328-image set from the same patient population not used in model development.
Main outcomes and measures: Area under the precision-recall curve (AUPRC), sensitivity, specificity, and accuracy.
Results: A total of 1179 deidentified UWF images with mild (380 [32.2%]) or moderate (799 [67.8%]) NPDR were included. DR progression was present in half of the training set (590 of 1179 [50.0%]). The model's AUPRC was 0.717 for baseline mild NPDR and 0.863 for moderate NPDR. On the validation set for eyes with mild NPDR, sensitivity was 0.72 (95% CI, 0.57-0.83), specificity was 0.63 (95% CI, 0.57-0.69), prevalence was 0.15 (95% CI, 0.12-0.20), and accuracy was 64.3%; for eyes with moderate NPDR, sensitivity was 0.80 (95% CI, 0.70-0.87), specificity was 0.72 (95% CI, 0.66-0.76), prevalence was 0.22 (95% CI, 0.19-0.27), and accuracy was 73.8%. In the validation set, 6 of 9 eyes (75%) with mild NPDR and 35 of 41 eyes (85%) with moderate NPDR that progressed 2 steps or more were identified. All 4 eyes with mild NPDR that progressed within 6 months and 1 year were identified, and 8 of 9 (89%) and 17 of 20 (85%) with moderate NPDR that progressed within 6 months and 1 year, respectively, were identified.
Conclusions and relevance: This study demonstrates the accuracy and feasibility of automated ML models for identifying DR progression developed using UWF images, especially for prediction of 2-step or greater DR progression within 1 year. Potentially, the use of ML algorithms may refine the risk of disease progression and identify those at highest short-term risk, thus reducing costs and improving vision-related outcomes.

