Project description:
Introduction: Radiographic bone age (BA) assessment is widely used to evaluate children's growth disorders and predict their future height. Moreover, children are more sensitive and vulnerable to X-ray radiation exposure than adults. The purpose of this study is to develop a new, safer, radiation-free BA assessment method for children using three-dimensional ultrasound (3D-US) and artificial intelligence (AI), and to test its diagnostic accuracy and reliability.
Methods and analysis: This is a prospective, observational study. All participants will be recruited through the Paediatric Growth and Development Clinic. Each participant will undergo left-hand 3D-US and X-ray examination at the Shanghai Sixth People's Hospital on the same day, and all images will be recorded. The image data will be randomly divided into a training set (80% of all) and a test set (20% of all). The training set will be used to establish a cascade network for 3D-US skeletal image segmentation and BA prediction, achieving end-to-end prediction from image to BA. The test set will be used to evaluate the accuracy of the 3D-US AI BA model. We have developed a new ultrasonic scanning device capable of automatic 3D-US scanning of the hand. AI algorithms, such as convolutional neural networks, will be used to identify and segment the skeletal structures in the hand 3D-US images. We will achieve automatic segmentation of hand skeletal 3D-US images, establish a 3D-US BA prediction model, and test the accuracy of the prediction model.
Ethics and dissemination: The Ethics Committee of Shanghai Sixth People's Hospital approved this study (approval number 2022-019). Written informed consent will be obtained from the parent or guardian of each participant. Final results will be published in peer-reviewed journals and presented at national and international conferences.
Trial registration number: ChiCTR2200057236.
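The random 80/20 split of collected cases described in the protocol can be sketched in a few lines; the function name and seed are illustrative assumptions, not part of the study protocol:

```python
import random

def split_cases(case_ids, train_frac=0.8, seed=42):
    """Shuffle case IDs and split into training (80%) and test (20%) sets.
    The seed is an illustrative assumption to make the split reproducible."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_frac)
    return ids[:cut], ids[cut:]

# Example: split 100 hypothetical case IDs.
train_set, test_set = split_cases(range(100))
```

Splitting by case ID (rather than by individual image) keeps all images from one participant in the same set, which avoids leakage between training and test data.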
Project description:
Objective: Bone age (BA) is a crucial indicator of children's growth and development. This study tested the performance of a fully automated artificial intelligence (AI) system for BA assessment in Chinese children with abnormal growth and development.
Materials and methods: A fully automated AI system based on the Greulich and Pyle (GP) method was developed for Chinese children using 8,000 BA radiographs from five medical centers nationwide in China. A total of 745 cases (360 boys and 385 girls) with abnormal growth and development were then consecutively collected from another tertiary medical center in north China between January and October 2018 to test the system. The reference standard was defined as the result interpreted through consensus by two experienced reviewers (a radiologist with 10 years and an endocrinologist with 15 years of experience in BA reading) using the GP atlas. BA accuracy within 1 year, root mean square error (RMSE), mean absolute difference (MAD), and 95% limits of agreement according to the Bland-Altman plot were calculated.
Results: For Chinese pediatric patients with abnormal growth and development, the accuracy of this new automated AI system within 1 year was 84.60% compared to the reference standard, with the highest percentage (89.45%) in the 12- to 18-year group. The RMSE, MAD, and 95% limits of agreement of the AI system were 0.76 years, 0.58 years, and -1.547 to 1.428 years, respectively, according to the Bland-Altman plot. The largest differences between the AI and the experts' BA results were noted for patients of short stature with bone deformities, severe osteomalacia, or different rates of maturation of the carpals and phalanges.
Conclusions: The automated AI system achieved BA results comparable to those of experienced reviewers for Chinese children with abnormal growth and development.
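The agreement statistics reported here (RMSE, MAD, accuracy within 1 year, and Bland-Altman 95% limits of agreement) can all be derived from paired AI and reference readings. The following is a minimal sketch under that assumption, not the authors' code:

```python
import math
import statistics

def agreement_metrics(ai_ba, ref_ba):
    """RMSE, MAD, Bland-Altman 95% limits of agreement, and accuracy
    within 1 year for paired AI vs. reference bone-age readings (years)."""
    diffs = [a - r for a, r in zip(ai_ba, ref_ba)]
    n = len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    mad = sum(abs(d) for d in diffs) / n
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)                   # sample SD
    loa = (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff)
    acc_1y = sum(abs(d) <= 1.0 for d in diffs) / n      # fraction within 1 year
    return rmse, mad, loa, acc_1y
```

The limits of agreement are mean difference +/- 1.96 standard deviations of the differences, which is exactly what a Bland-Altman plot summarizes.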
Project description:
Background: The autonomous artificial intelligence (AI) system for bone age rating (BoneXpert) was designed to be used in clinical radiology practice as an AI-replace tool, replacing the radiologist completely.
Objective: The aim of this study was to investigate how the tool is used in clinical practice. Are radiologists more inclined to use BoneXpert to assist rather than replace themselves, and how much time is saved?
Materials and methods: We sent a survey consisting of eight multiple-choice questions to 282 radiologists in European departments already using the software.
Results: The 97 (34%) respondents came from 18 countries. Their answers revealed that before installing the automated method, 83 (86%) of the respondents took more than 2 min per bone age rating; this fell to 20 (21%) respondents after installation. Only 17/97 (18%) respondents used BoneXpert to completely replace the radiologist; the rest used it to assist radiologists to varying degrees. For instance, 39/97 (40%) never overruled the automated reading, while 9/97 (9%) overruled more than 5% of the automated ratings. The majority, 58/97 (60%), of respondents checked the radiographs themselves to exclude features of underlying disease.
Conclusion: BoneXpert significantly reduces reporting times for bone age determination. However, radiographic analysis involves more than just determining bone age; it also involves identification of abnormalities, and for this reason radiologists cannot be completely replaced. AI systems originally developed to replace the radiologist might be more suitable as AI-assist tools, particularly if they have not been validated to work autonomously, including the ability to omit ratings when the image is outside the range of validity.
Project description:
Background: In 2020, our center established a Tanner-Whitehouse 3 (TW3) artificial intelligence (AI) system using a convolutional neural network (CNN) built upon 9,059 radiographs. However, the system lacked a gold standard for comparison and had not been thoroughly evaluated in different working environments.
Methods: To further verify the applicability of the AI system in clinical bone age assessment (BAA) and to enhance the accuracy and homogeneity of BAA, a prospective multi-center validation was conducted. This study utilized 744 left-hand radiographs of patients aged 1 to 20 years (378 boys and 366 girls), obtained from nine children's hospitals between August and December 2020. BAAs were performed by the TW3 AI system and were also reviewed by experienced reviewers. Bone age accuracy within 1 year, root mean square error (RMSE), and mean absolute error (MAE) were calculated to evaluate accuracy. The kappa test and Bland-Altman (B-A) plot were used to measure diagnostic consistency.
Results: The system exhibited a high level of performance, producing results that closely aligned with those of the reviewers. It achieved an RMSE of 0.52 years and an accuracy of 94.55% for the radius, ulna, and short bones series. For the carpal series, the system achieved an RMSE of 0.85 years and an accuracy of 80.38%. Overall, the system displayed satisfactory accuracy and RMSE, particularly in patients over 7 years old, and excelled in evaluating the carpal bone age of patients aged 1-6. Both the kappa test and B-A plot demonstrated substantial consistency between the system and the reviewers, although the model encountered challenges in consistently distinguishing specific bones, such as the capitate.
Furthermore, the system's performance proved acceptable across genders, age groups, and radiography instruments.
Conclusions: In this multi-center validation, the system showcased its potential to enhance the efficiency and consistency of healthcare delivery, ultimately resulting in improved patient outcomes and reduced healthcare costs.
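The kappa test used above measures chance-corrected agreement between the system and the reviewers on categorical ratings (for example, binned bone-age stages). A minimal unweighted Cohen's kappa sketch, not the study's statistical code:

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: observed agreement corrected for the
    agreement expected by chance from each rater's label frequencies."""
    n = len(rater_a)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_exp = sum(freq_a[c] * freq_b[c] for c in freq_a | freq_b) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)
```

A kappa of 1 indicates perfect agreement, 0 indicates chance-level agreement; values above roughly 0.6 are conventionally read as "substantial" consistency, matching the wording of the Results.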
Project description: Adenoid hypertrophy may lead to pediatric obstructive sleep apnea and mouth breathing. Routine screening of adenoid hypertrophy in dental practice is helpful for preventing relevant craniofacial and systemic consequences. The purpose of this study was to develop an automated assessment tool for adenoid hypertrophy based on artificial intelligence. A clinical dataset containing 581 lateral cephalograms was used to train the convolutional neural network (CNN). According to Fujioka's method for adenoid hypertrophy assessment, the regions of interest were defined with four keypoint landmarks, and the adenoid ratio based on these landmarks was used for assessment. Another dataset consisting of 160 patients' lateral cephalograms was used to evaluate the performance of the network. Diagnostic performance was evaluated with statistical analysis. The developed system exhibited high sensitivity (0.906, 95% confidence interval [CI]: 0.750-0.980), specificity (0.938, 95% CI: 0.881-0.973), and accuracy (0.919, 95% CI: 0.877-0.961) for adenoid hypertrophy assessment. The area under the receiver operating characteristic curve was 0.987 (95% CI: 0.974-1.000). These results indicate that the proposed system can assess adenoid hypertrophy accurately. The CNN-incorporated system showed high accuracy and stability in detecting adenoid hypertrophy from children's lateral cephalograms, implying the feasibility of automated adenoid hypertrophy screening using a deep neural network model.
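Fujioka's method scores adenoid hypertrophy from an adenoid-to-nasopharynx depth ratio, each depth measured between a pair of cephalometric landmarks, which is why four keypoints suffice. The sketch below uses generic landmark names and the commonly cited 0.71 cutoff as an illustrative default; neither the names nor the cutoff is taken from this study:

```python
import math

def dist(p, q):
    """Euclidean distance between two 2-D landmark coordinates."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def adenoid_ratio(adenoid_a, adenoid_b, naso_a, naso_b, cutoff=0.71):
    """A/N ratio: adenoid depth over nasopharyngeal depth, each measured
    between a landmark pair detected by the CNN. Landmark names and the
    0.71 cutoff are illustrative assumptions, not values from this study."""
    ratio = dist(adenoid_a, adenoid_b) / dist(naso_a, naso_b)
    return ratio, ratio >= cutoff
```

In the described pipeline, the CNN's role is to localize the four landmarks; the ratio and threshold step that follows is simple geometry like the above.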
Project description:
Background: The accuracy and consistency of bone age assessments (BAA) using standard methods can vary with physicians' level of experience.
Methods: To assess the impact of information from an artificial intelligence (AI) deep learning convolutional neural network (CNN) model on BAA, specialists with different levels of experience (junior, mid-level, and senior) assessed radiographs from 316 children aged 4-18 years that had been randomly divided into two equal sets, group A and group B. Bone age (BA) was assessed independently by each specialist without additional information (group A) and with information from the model (group B). With the mean assessment of four experts as the reference standard, mean absolute error (MAE) and intraclass correlation coefficient (ICC) were calculated to evaluate accuracy and consistency. Individual assessments of 13 bones (radius, ulna, and short bones) were also compared between groups A and B with the rank-sum test.
Results: The accuracies of senior, mid-level, and junior physicians were significantly better (all P < 0.001) with AI assistance (MAEs 0.325, 0.344, and 0.370, respectively) than without (MAEs 0.403, 0.469, and 0.755, respectively). Moreover, for senior, mid-level, and junior physicians, consistency was significantly higher (all P < 0.001) with AI assistance (ICCs 0.996, 0.996, and 0.992, respectively) than without (ICCs 0.987, 0.989, and 0.941, respectively). For all levels of experience, accuracy with AI assistance was significantly better for assessments of the first and fifth proximal phalanges.
Conclusions: Information from an AI model improves both the accuracy and the consistency of bone age assessments for physicians of all levels of experience. The first and fifth proximal phalanges are difficult to assess and deserve particular attention.
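The ICC used here to quantify consistency can be computed, for example, as ICC(2,1) (two-way random effects, absolute agreement, single rater, in the Shrout-Fleiss taxonomy); the abstract does not state which ICC form was used, so this is an illustrative sketch rather than the authors' implementation:

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.
    `ratings` is an n-subjects x k-raters matrix (list of lists)."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]
    msr = k * sum((r - grand) ** 2 for r in row_means) / (n - 1)   # subjects
    msc = n * sum((c - grand) ** 2 for c in col_means) / (k - 1)   # raters
    sse = sum((ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(n) for j in range(k))
    mse = sse / ((n - 1) * (k - 1))                                # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Because ICC(2,1) measures absolute agreement, a systematic offset between raters lowers the coefficient even when their rankings agree perfectly, which suits a task where the bone-age value itself matters.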
Project description: Artificial intelligence (AI) based on convolutional neural networks (CNNs) has great potential to enhance medical workflow and improve health care quality. Of particular interest is the practical implementation of such AI-based software as a cloud-based tool for telemedicine, the practice of providing medical care from a distance using electronic interfaces.
Methods: In this study, we used a dataset of 35,900 labeled optical coherence tomography (OCT) images obtained from age-related macular degeneration (AMD) patients to train three types of CNNs to perform AMD diagnosis.
Results: Here, we present an AI- and cloud-based telemedicine interaction tool for the diagnosis and proposed treatment of AMD. Through a deep learning process based on the analysis of preprocessed OCT imaging data, our AI-based system achieved the same image discrimination rate as that of retinal specialists in our hospital. The AI platform's detection accuracy was generally higher than 90% and was significantly superior (p < 0.001) to that of medical students (69.4% and 68.9%) and equal (p = 0.99) to that of retinal specialists (92.73% and 91.90%). Furthermore, it provided appropriate treatment recommendations comparable to those of retinal specialists.
Conclusions: We therefore developed a website providing cloud computing based on this AI platform, available at https://www.ym.edu.tw/~AI-OCT/. Patients can upload their OCT images to the website to verify whether they have AMD and require treatment. An AI-based cloud service represents a practical solution for medical imaging diagnostics and telemedicine.
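The significance comparisons between the platform's accuracy and that of the medical students or specialists can be illustrated with a two-proportion z-test. The sample sizes below are made up for illustration, since the abstract does not report the number of test images per comparison, and the abstract does not state which test the authors used:

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Two-sided two-proportion z-test with a pooled variance estimate;
    returns (z, p-value). n1 and n2 are illustrative assumptions here."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # 2 * (1 - Phi(|z|))
    return z, p_value
```

With accuracies as far apart as 92% vs. 69%, even moderate sample sizes yield p-values far below 0.001, consistent with the reported result.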
Project description:
Purpose of review: Predicting treatment response and optimizing treatment regimens in patients with neovascular age-related macular degeneration (nAMD) remains challenging. Artificial intelligence-based tools have the potential to increase confidence in clinical development of new therapeutics, facilitate individual prognostic predictions, and ultimately inform treatment decisions in clinical practice.
Recent findings: To date, most advances in applying artificial intelligence to nAMD have focused on facilitating image analysis, particularly automated segmentation, extraction, and quantification of imaging-based features from optical coherence tomography (OCT) images. No studies in our literature search evaluated whether artificial intelligence could predict the treatment regimen required for an optimal visual response in an individual patient. Challenges identified for developing artificial intelligence-based models for nAMD include the limited number of large datasets with high-quality OCT data, which limits the patient populations included in model development; the lack of counterfactual data to inform how individual patients may have fared with an alternative treatment strategy; and the absence of OCT data standards, which impairs the development of models usable across devices.
Summary: Artificial intelligence has the potential to enable powerful prognostic tools for a complex nAMD treatment landscape; however, additional work remains before these tools can inform treatment decisions for nAMD in clinical practice.
Project description:
Background: Surgical site infection (SSI) is one of the most common types of health care-associated infections. It increases mortality, prolongs hospital length of stay, and raises health care costs. Many institutions have developed risk assessment models for SSI to help surgeons preoperatively identify high-risk patients and guide clinical intervention. However, most of these models have had low accuracy.
Objective: We aimed to provide a solution in the form of an Artificial intelligence-based Multimodal Risk Assessment Model for Surgical site infection (AMRAMS) for inpatients undergoing operations, using routinely collected clinical data. We internally and externally validated the discrimination of the models, which combined various machine learning and natural language processing techniques, and compared them with the National Nosocomial Infections Surveillance (NNIS) risk index.
Methods: We retrieved inpatient records between January 1, 2014, and June 30, 2019, from the electronic medical record (EMR) system of Rui Jin Hospital, Luwan Branch, Shanghai, China. We used data from before July 1, 2018, as the development set for internal validation and the remaining data as the test set for external validation. Features included patient demographics, preoperative lab results, and free-text preoperative notes. We used word-embedding techniques to encode the text information, and we trained LASSO (least absolute shrinkage and selection operator), random forest, gradient boosting decision tree (GBDT), convolutional neural network (CNN), and self-attention network models on the combined data. Surgeons manually scored the NNIS risk index values.
Results: For internal bootstrapping validation, the CNN yielded the highest mean area under the receiver operating characteristic curve (AUROC) of 0.889 (95% CI 0.886-0.892), and the paired-sample t test revealed statistically significant advantages over the other models (P<.001).
The self-attention network yielded the second-highest mean AUROC of 0.882 (95% CI 0.878-0.886), but this was only numerically higher than the AUROC of the third-best model, GBDT with text embeddings (mean AUROC 0.881, 95% CI 0.878-0.884, P=.47). The AUROCs of the LASSO, random forest, and GBDT models using text embeddings were statistically higher than those of the models not using text embeddings (P<.001). For external validation, the self-attention network yielded the highest AUROC of 0.879; the CNN was second (AUROC 0.878), and GBDT with text embeddings was third (AUROC 0.872). The NNIS risk index scored by surgeons had an AUROC of 0.651.
Conclusions: Our AMRAMS, based on EMR data and deep learning methods (CNN and self-attention network), had significant accuracy advantages over conventional machine learning methods and the NNIS risk index. Moreover, the semantic embeddings of preoperative notes further improved model performance. Our models could replace the NNIS risk index to provide personalized guidance for the preoperative intervention of SSIs. Through this case, we offer an easy-to-implement solution for building multimodal risk assessment models in similar scenarios.
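The bootstrapped AUROC used for internal validation can be sketched as follows: recompute AUROC on each resample of the predictions and take percentile confidence limits. A minimal illustration, not the study's pipeline:

```python
import random

def auroc(labels, scores):
    """AUROC by pairwise comparison of positive vs. negative scores
    (ties count as 0.5); equivalent to the Mann-Whitney U statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auroc(labels, scores, n_boot=1000, seed=0):
    """Mean AUROC and 95% percentile CI over bootstrap resamples;
    resamples containing a single class are skipped."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        ss = [scores[i] for i in sample]
        if 0 < sum(ys) < n:                      # need both classes
            stats.append(auroc(ys, ss))
    stats.sort()
    lo = stats[int(0.025 * len(stats))]
    hi = stats[int(0.975 * len(stats)) - 1]
    return sum(stats) / len(stats), (lo, hi)
```

Bootstrapping the evaluation set in this way yields the mean AUROC with a confidence interval, matching the form of the reported results (e.g., 0.889, 95% CI 0.886-0.892).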