Project description:Objectives: Evidence-based protocols for managing bleeding emergencies in patients with immune thrombocytopenia (ITP) are lacking. We conducted a systematic review of treatments for critical bleeding in patients with ITP. Methods: We included all study designs and extracted data in aggregate or individually for patients who received one or more interventions and for whom any of the following outcomes were reported: platelet count response, bleeding, disability, or death. Results: We identified 49 eligible studies reporting 112 critical bleed patients with ITP, including 66 children (median age, 10 years), 36 adults (median age, 41.5 years), and 10 patients with unreported age. Patients received corticosteroids (n = 67), IVIG (n = 49), platelet transfusions (n = 41), TPO-RAs (n = 17), and splenectomy (n = 28), either alone or in combination. Studies reported 29 different treatment combinations; the 5 most common were corticosteroids, platelet transfusion, and splenectomy (n = 13); corticosteroids and IVIG (n = 13); splenectomy alone (n = 13); IVIG alone (n = 11); and corticosteroids, IVIG, and TPO-RA (n = 8). Mortality among patients with critical bleeds in ITP was 30.6% for adults and 19.7% for children. Conclusions: The effects of individual treatments on patient outcomes were uncertain due to very low-quality evidence. There is a need for a standardized approach to the treatment of ITP critical bleeds. Systematic review registration: CRD42020161206.
Project description:Drug-induced immune thrombocytopenia (DITP) often occurs in patients receiving many drug treatments simultaneously. However, clinicians often cannot accurately identify which drugs are the plausible culprits. Despite significant advances in laboratory-based DITP testing, in vitro experimental assays remain expensive and, in certain cases, cannot provide a timely diagnosis to patients. To address these shortcomings, this paper proposes an efficient machine learning-based method for DITP toxicity prediction. A small dataset consisting of 225 molecules was constructed. The molecules were represented by six fingerprints, three descriptors, and their combinations. Seven classical machine learning-based models were examined to determine an optimal model. The results show that the RDMD + PubChem-k-NN model provides the best prediction performance among all the models, achieving an area under the curve of 76.9% and overall accuracy of 75.6% on the external validation set. The applicability domain (AD) analysis demonstrates the prediction reliability of the RDMD + PubChem-k-NN model. Five structural fragments related to DITP toxicity are identified through the information gain (IG) method along with fragment frequency analysis. To our knowledge, this is the first machine learning-based classification model for recognizing chemicals with DITP toxicity, and it can be used as an efficient tool in drug design and clinical therapy.
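As a rough illustration of how a k-NN classifier can operate on molecular fingerprints, the sketch below represents each molecule as a set of "on" bit indices and votes among the k most similar training molecules. The Tanimoto similarity, the toy bit sets, and the names `tanimoto`, `knn_predict`, and `TRAINING` are assumptions made for illustration; the abstract does not state which distance measure or k value the RDMD + PubChem-k-NN model uses.

```python
from collections import Counter


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def knn_predict(query: frozenset, training, k: int = 3) -> str:
    """Majority vote among the k training molecules most similar to `query`.

    `training` is a list of (fingerprint, label) pairs.
    """
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy fingerprints (bit indices only, not real molecules).
TRAINING = [
    (frozenset({1, 2, 3, 4}), "DITP"),
    (frozenset({1, 2, 3, 9}), "DITP"),
    (frozenset({2, 3, 4, 5}), "DITP"),
    (frozenset({10, 11, 12}), "non-DITP"),
    (frozenset({10, 11, 13}), "non-DITP"),
]

if __name__ == "__main__":
    print(knn_predict(frozenset({1, 2, 3, 5}), TRAINING, k=3))
```

A real pipeline would compute the fingerprints with a cheminformatics toolkit and tune k by cross-validation; this sketch only shows the voting step.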
Project description:Increasingly available open medical and health datasets encourage data-driven research with a promise of improving patient care through knowledge discovery and algorithm development. Among efficient approaches to such high-dimensional problems are a number of machine learning methods, which are applied in this paper to pressure ulcer prediction in modular critical care data. An inherent property of many health-related datasets is a high number of irregularly sampled, time-variant, and sparsely populated features, often exceeding the number of observations. Although machine learning methods are known to work well under such circumstances, many choices regarding model and data processing exist. In particular, this paper addresses both theoretical and practical aspects of applying six classification models to pressure ulcer prediction, using one of the largest available critical care databases, the Medical Information Mart for Intensive Care (MIMIC-IV). Random forest, with an accuracy of 96%, is the best-performing approach among the considered machine learning algorithms.
Project description:Although many studies have been conducted on machine learning (ML) models for Parkinson's disease (PD) prediction using neuroimaging and movement analyses, studies with large population-based datasets are limited. We aimed to propose PD prediction models using ML algorithms based on the National Health Insurance Service-Health Screening datasets. We selected individuals who participated in national health-screening programs > 5 times between 2002 and 2015. PD was defined based on the ICD-code (G20), and a matched cohort of individuals without PD was selected using a 1:1 random sampling method. Various ML algorithms were applied for PD prediction, and the performance of the prediction models was compared. Neural networks, gradient boosting machines, and random forest algorithms exhibited the best average prediction accuracy (average area under the receiver operating characteristic curve (AUC): 0.779, 0.766, and 0.731, respectively) among the algorithms validated in this study. The overall model performance metrics were higher in men than in women (AUC: 0.742 and 0.729, respectively). The most important factor for predicting PD occurrence was body mass index, followed by total cholesterol, glucose, hemoglobin, and blood pressure levels. Smoking and alcohol consumption (in men) and socioeconomic status, physical activity, and diabetes mellitus (in women) were highly correlated with the occurrence of PD. The proposed health-screening dataset-based PD prediction model using ML algorithms is readily applicable, produces validated results, and could be a useful option for PD prediction models.
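The 1:1 random sampling of a matched comparison cohort described above could be sketched as follows. The matching variables (sex and age band), the record layout, and the names `match_one_to_one` and `match_key` are hypothetical; the abstract does not say which variables, if any, were used for matching.

```python
import random
from collections import defaultdict


def match_one_to_one(cases, controls, key, seed=0):
    """Pair each case with one randomly chosen, unused control sharing key(person).

    Cases with no remaining control in their stratum are left unmatched.
    """
    rng = random.Random(seed)
    pools = defaultdict(list)
    for person in controls:
        pools[key(person)].append(person)
    for pool in pools.values():
        rng.shuffle(pool)  # random order, so pop() draws a random control
    pairs = []
    for case in cases:
        pool = pools.get(key(case))
        if pool:
            pairs.append((case, pool.pop()))  # pop() ensures no control is reused
    return pairs


def match_key(person):
    # Assumed matching variables, for illustration only.
    return (person["sex"], person["age_band"])


# Hypothetical toy records.
CASES = [
    {"id": "c1", "sex": "M", "age_band": "60-64"},
    {"id": "c2", "sex": "F", "age_band": "65-69"},
    {"id": "c3", "sex": "F", "age_band": "80-84"},
]
CONTROLS = [
    {"id": "n1", "sex": "M", "age_band": "60-64"},
    {"id": "n2", "sex": "M", "age_band": "60-64"},
    {"id": "n3", "sex": "F", "age_band": "65-69"},
]

if __name__ == "__main__":
    for case, control in match_one_to_one(CASES, CONTROLS, key=match_key):
        print(case["id"], "->", control["id"])
```

Sampling without replacement keeps the two cohorts the same size, which gives the balanced classes that the reported AUC comparisons implicitly rely on.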
Project description:Gene expression profiles were generated from 199 primary breast cancer patients. Samples 1-176 were used in another study, GEO Series GSE22820, and form the training data set in this study. Sample numbers 200-222 form a validation set. This data is used to model a machine learning classifier for Estrogen Receptor Status. RNA was isolated from 199 primary breast cancer patients. A machine learning classifier was built to predict ER status using only three gene features.
Project description:Preoperative risk assessment is essential for shared decision-making and adequate perioperative care. Common scores provide limited predictive quality and lack personalized information. The aim of this study was to create an interpretable machine-learning-based model to assess the patient's individual risk of postoperative mortality based on preoperative data to allow analysis of personal risk factors. After ethical approval, a model for prediction of postoperative in-hospital mortality based on preoperative data of 66,846 patients undergoing elective non-cardiac surgery between June 2014 and March 2020 was created with extreme gradient boosting. Model performance and the most relevant parameters were shown using receiver operating characteristic (ROC) and precision-recall (PR) curves and importance plots. Individual risks of index patients were presented in waterfall diagrams. The model included 201 features and showed good predictive abilities with an area under the receiver operating characteristic curve (AUROC) of 0.95 and an area under the precision-recall curve (AUPRC) of 0.109. The feature with the highest information gain was the preoperative order for packed red blood cell concentrates, followed by age and C-reactive protein. Individual risk factors could be identified at the patient level. We created a highly accurate and interpretable machine learning model to preoperatively predict the risk of postoperative in-hospital mortality. The algorithm can be used to identify factors susceptible to preoperative optimization measures and to identify risk factors influencing individual patient risk.
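A high AUROC alongside a low AUPRC, as reported above, is typical when the outcome is rare, because precision is dragged down by even a few false positives ranked above the true events. The sketch below computes both metrics on hypothetical toy data (one event among 1,000 patients); the data, the function names, and the step-wise average-precision estimator are assumptions for illustration and are not taken from the study.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), reverse=True)
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / n_pos


# Hypothetical toy cohort: 1,000 patients, one event, ranked 10th by the model.
LABELS = [0] * 9 + [1] + [0] * 990
SCORES = [1000 - i for i in range(1000)]  # strictly decreasing, so no ties

if __name__ == "__main__":
    print(f"AUROC = {auroc(LABELS, SCORES):.3f}")
    print(f"AUPRC = {average_precision(LABELS, SCORES):.3f}")
```

Here the single event outranks 990 of 999 non-events, so the AUROC is about 0.99, while the average precision is only 0.10. For comparison, a classifier with no skill would have an AUPRC close to the event rate, 0.001 in this toy cohort.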
Project description:Background: This study introduced machine learning approaches to predict newborn's body mass index (BMI) based on ultrasound measures and maternal/delivery information. Methods: Data came from 3159 obstetric patients and their newborns enrolled in a multi-center retrospective study. Variable importance, the effect of a variable on model performance, was used for identifying major predictors of newborn's BMI among ultrasound measures and maternal/delivery information. The ultrasound measures included biparietal diameter (BPD), abdominal circumference (AC), and estimated fetal weight (EFW), taken three times during weeks 21 to 35 of gestational age and once in week 36 or later. Results: Based on variable importance from the random forest, major predictors of newborn's BMI were the first AC and EFW in week 36 or later, gestational age at delivery, the first AC during weeks 21 to 35, maternal BMI at delivery, maternal weight at delivery, and the first BPD in week 36 or later. For predicting newborn's BMI, linear regression (2.0744) and the random forest (2.1610) were better than artificial neural networks with one, two, and three hidden layers (150.7100, 154.7198, and 152.5843, respectively) in terms of mean squared error. Conclusions: This is the first machine-learning study with 64 clinical and sonographic markers for the prediction of newborns' BMI. Week 36 or later is the most effective period for taking the ultrasound measures, and AC and EFW are the best predictors of newborn's BMI alongside gestational age at delivery and maternal BMI at delivery.
Project description:The present study introduces a novel approach utilizing machine learning techniques to predict the crucial mechanical properties of engineered cementitious composites (ECCs), spanning from typical to exceptionally high strength levels. These properties include compressive strength, flexural strength, tensile strength, and tensile strain capacity. The investigation encompassed a meticulous compilation and examination of 1532 datasets sourced from pertinent research. Four machine learning algorithms, linear regression (LR), K nearest neighbors (KNN), random forest (RF), and extreme gradient boosting (XGB), were used to establish the prediction model of ECC mechanical properties and determine the optimal model. SHapley Additive exPlanations (SHAP) were then applied to the optimal model to examine feature importance and conduct an in-depth parametric analysis. Subsequently, a comprehensive control strategy was devised for ECC mechanical properties. This strategy can provide actionable guidance for ECC design, equipping engineers and professionals in civil engineering and material science to make informed decisions throughout their design endeavors. The results show that the RF model demonstrated the highest prediction accuracy for compressive strength and flexural strength, with R² values of 0.92 and 0.91 on the test set, respectively. The XGB model performed best in predicting tensile strength and tensile strain capacity, with R² values of 0.87 and 0.80 on the test set, respectively. The prediction of tensile strain capacity was the least accurate. Nevertheless, the MAE of the tensile strain capacity was a mere 0.84%, smaller than the variability (1.77%) of the test results in previous research. Compressive strength and tensile strength demonstrated high sensitivity to variations in both water-cement ratio (W) and water reducer (WR). In contrast, flexural strength exhibited high sensitivity solely to changes in W, while the sensitivity of tensile strain capacity to input features was moderate and consistent. The mechanical attributes of ECC emerged from the combined effects of multiple positive and negative features. Notably, WR exerted the most significant influence on compressive strength among all features, whereas polyethylene (PE) fiber emerged as the primary driver affecting flexural strength, tensile strength, and tensile strain capacity.
Project description:Early childhood asthma diagnosis is common; however, many children diagnosed before age 5 experience symptom resolution and it remains difficult to identify individuals whose symptoms will persist. Our objective was to develop machine learning models to identify which individuals diagnosed with asthma before age 5 continue to experience asthma-related visits. We curated a retrospective dataset for 9,934 children derived from electronic health record (EHR) data. We trained five machine learning models to differentiate individuals without subsequent asthma-related visits (transient diagnosis) from those with asthma-related visits between ages 5 and 10 (persistent diagnosis) given clinical information up to age 5 years. Based on average NPV-Specificity area (ANSA), all models performed significantly better than random chance, with XGBoost obtaining the best performance (0.43 mean ANSA). Feature importance analysis indicated age of last asthma diagnosis under 5 years, total number of asthma related visits, self-identified black race, allergic rhinitis, and eczema as important features. Although our models appear to perform well, a lack of prior models utilizing a large number of features to predict individual persistence makes direct comparison infeasible. However, feature importance analysis indicates our models are consistent with prior research indicating diagnosis age and prior health service utilization as important predictors of persistent asthma. We therefore find that machine learning models can predict which individuals will experience persistent asthma with good performance and may be useful to guide clinician and parental decisions regarding asthma counselling in early childhood.
Project description:Lung cancers with a mutated epidermal growth factor receptor (EGFR) are a major contributor to cancer fatalities globally. Targeted tyrosine kinase inhibitors (TKIs) have been developed against EGFR and show encouraging results for survival rate and quality of life. However, drug resistance may affect treatment plans, and treatment efficacy may be lost after about a year. Predicting the response to EGFR-TKIs for EGFR-mutated lung cancer patients is a key research area. In this study, we propose a personalized drug response prediction model (PDRP), based on molecular dynamics simulations and machine learning, to predict the response to the first-generation FDA-approved small-molecule EGFR-TKIs, Gefitinib/Erlotinib, in lung cancer patients. The patient's mutation status is taken into consideration in the molecular dynamics (MD) simulation. Each patient's unique mutation status was modeled in the MD simulation to extract molecular-level geometric features. Moreover, additional clinical features were incorporated into the machine learning model for drug response prediction. The complete feature set includes demographic and clinical information (DCI), geometrical properties of the drug-target binding site, and the binding free energy of the drug-target complex from the MD simulation. PDRP incorporates an XGBoost classifier, which achieves state-of-the-art performance with 97.5% accuracy, 93% recall, 96.5% precision, and 94% F1-score for a 4-class drug response prediction task. We found that modeling the geometry of the binding pocket combined with binding free energy is a good predictor of drug response. However, we observed that clinical information had little impact on the performance of the model. The proposed model could be tested on other types of cancers. We believe PDRP will support the planning of effective treatment regimes based on clinical-genomic information. The source code and related files are available on GitHub at: https://github.com/rizwanqureshi123/PDRP/ .
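For a multi-class task such as the 4-class response prediction above, accuracy, precision, recall, and F1 can differ noticeably when classes are imbalanced. The sketch below computes accuracy together with macro-averaged precision, recall, and F1 on hypothetical labels. Macro averaging, the class encoding 0 to 3, and the function name `macro_metrics` are assumptions; the abstract does not state how the reported metrics were averaged.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 over all classes."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical labels for four response classes (0..3); class 0 dominates.
Y_TRUE = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3]
Y_PRED = [0, 0, 0, 0, 0, 0, 1, 0, 2, 3]

if __name__ == "__main__":
    for name, value in macro_metrics(Y_TRUE, Y_PRED).items():
        print(f"{name}: {value:.3f}")
```

In this toy example a single misclassified sample from a small class lowers macro recall (0.875) more than accuracy (0.900), which is one way accuracy can exceed recall in an imbalanced multi-class setting.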