Dataset Information

Inter-observer variability of expert-derived morphologic risk predictors in aortic dissection.

ABSTRACT:

Objectives

Establishing the reproducibility of expert-derived measurements on CTA exams of aortic dissection is clinically important and paramount for ground-truth determination for machine learning.

Methods

Four independent observers retrospectively evaluated CTA exams of 72 patients with uncomplicated Stanford type B aortic dissection and assessed the reproducibility of a recently proposed combination of four morphologic risk predictors (maximum aortic diameter, false lumen circumferential angle, false lumen outflow, and intercostal arteries). For the first inter-observer variability assessment, 47 CTA scans from one aortic center were evaluated by expert-observer 1 in an unconstrained clinical assessment without a standardized workflow and compared to a composite of three expert-observers (observers 2-4) using a standardized workflow. A second inter-observer variability assessment on 30 out of the 47 CTA scans compared observers 3 and 4 with a constrained, standardized workflow. A third inter-observer variability assessment was done after specialized training and tested between observers 3 and 4 in an external population of 25 CTA scans. Inter-observer agreement was assessed with intraclass correlation coefficients (ICCs) and Bland-Altman plots.
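As an illustration of the agreement statistic named above, the following is a minimal sketch of an intraclass correlation coefficient. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form (often written ICC(2,1)) is an assumption here. The function name `icc_2_1` and the example diameters are hypothetical and are not taken from the study.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per CTA scan, one value per observer.
    Assumed form; the source abstract does not specify the ICC model.
    """
    n = len(ratings)          # number of subjects (scans)
    k = len(ratings[0])       # number of observers
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-observer mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical maximum aortic diameters (mm) from two observers
example = [[40.0, 41.0], [45.0, 44.0], [52.0, 53.0], [38.0, 38.0]]
print(round(icc_2_1(example), 3))
```

Because this form measures absolute agreement, a constant offset between observers lowers the coefficient even when their rankings of the scans are identical.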

Results

Pre-training ICCs of the four morphologic features ranged from 0.04 (-0.05 to 0.13) to 0.68 (0.49-0.81) between observer 1 and observers 2-4 and from 0.50 (0.32-0.69) to 0.89 (0.78-0.95) between observers 3 and 4. After training, ICCs improved, ranging from 0.69 (0.52-0.87) to 0.97 (0.94-0.99), and Bland-Altman analysis showed decreased bias and narrower limits of agreement.
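For readers unfamiliar with the Bland-Altman quantities mentioned here, a minimal sketch follows. It computes the bias (the mean of the paired differences) and the conventional 95% limits of agreement (bias ± 1.96 standard deviations of the differences). The function name `bland_altman` and the sample values are hypothetical and do not come from the study data.

```python
from statistics import mean, stdev


def bland_altman(obs_a, obs_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the differences obs_a - obs_b; the limits of
    agreement are bias +/- 1.96 * sample standard deviation of the differences.
    """
    diffs = [a - b for a, b in zip(obs_a, obs_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical diameters (mm) measured by two observers on four scans
bias, lower, upper = bland_altman([40, 42, 45, 50], [41, 41, 46, 48])
print(bias, lower, upper)
```

A bias closer to zero and a narrower interval between the two limits both indicate better agreement between observers.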

Conclusions

Manual morphologic feature measurements on CTA images can be optimized, resulting in improved inter-observer reliability. This is essential for robust ground-truth determination for machine learning models.

Key points

• Manual measurements of aortic CTA imaging features performed in a clinical fashion showed poor inter-observer reproducibility.
• A standardized workflow with standardized training resulted in substantial improvements, with excellent inter-observer reproducibility.
• Robust ground-truth labels obtained manually with excellent inter-observer reproducibility are key to developing reliable machine learning models.

SUBMITTER: Willemink MJ

PROVIDER: S-EPMC10017115 | biostudies-literature | 2023 Feb

REPOSITORIES: biostudies-literature

Publications

Inter-observer variability of expert-derived morphologic risk predictors in aortic dissection.

Willemink Martin J, Mastrodicasa Domenico, Madani Mohammad H, Codari Marina, Chepelev Leonid L, Mistelbauer Gabriel, Hanneman Kate, Ouzounian Maral, Ocazionez Daniel, Afifi Rana O, Lacomis Joan M, Lovato Luigi, Pacini Davide, Folesani Gianluca, Hinzpeter Ricarda, Alkadhi Hatem, Stillman Arthur E, Sailer Anna M, Turner Valery L, Hinostroza Virginia, Bäumler Kathrin, Chin Anne S, Burris Nicholas S, Miller D Craig, Fischbein Michael P, Fleischmann Dominik

European Radiology, 2022-08-27, issue 2

PMID: 36029344

Similar Datasets

Project description:

Background

PET-based tumor delineation is an error-prone and labor-intensive part of image analysis. Segmentation is especially challenging for patients with advanced disease showing bulky tumor FDG load. Reducing the amount of user interaction might facilitate segmentation tasks, especially when labeling bulky and complex tumors. Therefore, this study reports on segmentation workflows/strategies with different levels of user interaction that may reduce the inter-observer variability for large tumors with complex shapes.

Methods

Twenty PET images of bulky tumors were delineated independently by six observers using four strategies: (I) manual, (II) interactive threshold-based, (III) interactive threshold-based segmentation with the additional presentation of the PET-gradient image, and (IV) selection of the most reasonable result out of four established semi-automatic segmentation algorithms (Select-the-best approach). The segmentations were compared using Jaccard coefficients (JC) and percentage volume differences. To obtain a reference standard, a majority vote (MV) segmentation was calculated including all segmentations of experienced observers. The performed segmentations were compared with the MV segmentation regarding positive predictive value (PPV), sensitivity (SE), and percentage volume differences.

Results

The results show that inter-observer variability decreases with decreasing user interaction. JC values and percentage volume differences of Select-the-best and of the workflow including gradient information were significantly better than those of the other segmentation strategies (p-value < 0.01). Interactive threshold-based and manual segmentations also resulted in significantly lower and more variable PPV/SE values when compared with the MV segmentation.

Conclusions

FDG PET segmentations of bulky tumors using strategies with lower user interaction showed less inter-observer variability. None of the methods led to good results in all cases, but the gradient and Select-the-best workflows outperformed the other strategies tested and may be good candidates for fast and reliable labeling of bulky and heterogeneous tumors.
