BackgroundAppropriate empirical treatment for candidemia is associated with reduced mortality; however, the timely diagnosis of candidemia in patients with sepsis remains poor.
ObjectiveWe aimed to use machine learning algorithms to develop and validate a candidemia prediction model for patients with cancer.
MethodsWe conducted a single-center retrospective study using the cancer registry of a tertiary academic hospital. Adult patients diagnosed with malignancies between January 2010 and December 2018 were included. Our study outcome was the prediction of candidemia events. A stratified undersampling method was used to extract control data for algorithm learning. Multiple models were developed-a combination of 4 variable groups and 5 algorithms (auto-machine learning, deep neural network, gradient boosting, logistic regression, and random forest). The model with the largest area under the receiver operating characteristic curve (AUROC) was selected as the Candida detection (CanDETEC) model after comparing its performance indexes with those of the Candida Score Model.
ResultsFrom a total of 273,380 blood cultures from 186,404 registered patients with cancer, we extracted 501 records of candidemia events and 2000 records as control data. Performance among the different models varied (AUROC 0.771- 0.889), with all models demonstrating superior performance to that of the Candida Score (AUROC 0.677). The random forest model performed the best (AUROC 0.889, 95% CI 0.888-0.889); therefore, it was selected as the CanDETEC model.
ConclusionsThe CanDETEC model predicted candidemia in patients with cancer with high discriminative power. This algorithm could be used for the timely diagnosis and appropriate empirical treatment of candidemia.