Dataset Information

Comparisons of the prediction models for undiagnosed diabetes between machine learning versus traditional statistical methods.


ABSTRACT: We compared the performance of machine learning-based models for predicting undiagnosed diabetes with that of traditional statistics-based models. We used the 2014-2020 Korean National Health and Nutrition Examination Survey (KNHANES) (N = 32,827). The KNHANES 2014-2018 data were used as training and internal validation sets and the 2019-2020 data as external validation sets. The area under the receiver operating characteristic curve (AUC) was used to compare the two types of model. Using sex, age, resting heart rate, and waist circumference as features, the machine learning-based model showed a higher AUC than the traditional statistics-based model (0.788 vs. 0.740). Using sex, age, waist circumference, family history of diabetes, hypertension, alcohol consumption, and smoking status as features, the machine learning-based model again showed a higher AUC (0.802 vs. 0.759). The machine learning-based model using the features selected for maximum prediction performance also showed a higher AUC than the traditional statistics-based model (0.819 vs. 0.765). Machine learning-based prediction models using anthropometric and lifestyle measurements may outperform traditional statistics-based prediction models in predicting undiagnosed diabetes.
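
As a rough illustration of the metric used for the comparison, the sketch below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. It is not the authors' code: the function name `roc_auc`, the two score lists, and the toy labels are all made up for illustration and have no connection to the KNHANES data or the reported AUC values.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the ROC AUC via pairwise comparison of positive and negative scores.

    labels: 1 for a positive case (e.g. undiagnosed diabetes), 0 otherwise.
    scores: predicted risk from a model; higher means more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: the same six cases scored by two hypothetical models.
labels = [0, 0, 1, 1, 0, 1]
model_a_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
model_b_scores = [0.3, 0.6, 0.4, 0.5, 0.2, 0.7]

auc_a = roc_auc(labels, model_a_scores)
auc_b = roc_auc(labels, model_b_scores)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

In a setup like the one described, both models would be scored on the same held-out cases (here, the later survey years), so that a difference in AUC reflects the models and not the samples. The pairwise loop is quadratic in the number of cases; for a survey-sized dataset a rank-based implementation from a statistics library would be the more practical choice.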

SUBMITTER: Choi SG 

PROVIDER: S-EPMC10421881 | biostudies-literature | 2023 Aug

REPOSITORIES: biostudies-literature

Publications

Comparisons of the prediction models for undiagnosed diabetes between machine learning versus traditional statistical methods.

Choi Seong Gyu, Oh Minsuk, Park Dong-Hyuk, Lee Byeongchan, Lee Yong-Ho, Jee Sun Ha, Jeon Justin Y

Scientific Reports, 2023 Aug 11 (issue 1)



Similar Datasets

| S-EPMC10952221 | biostudies-literature
| S-EPMC11571240 | biostudies-literature
| S-EPMC10413234 | biostudies-literature
| S-EPMC11877273 | biostudies-literature
| S-EPMC7667810 | biostudies-literature
| S-EPMC10218433 | biostudies-literature
2017-06-12 | PXD003608 | Pride
| S-EPMC9951458 | biostudies-literature
| S-EPMC10428032 | biostudies-literature
| S-EPMC11784135 | biostudies-literature