Project description: Background: Chemotherapy is a common treatment for patients with resected non-small cell lung cancer (NSCLC). However, there are few models for predicting the survival outcomes of these patients. Here, we developed a clinical nomogram for predicting overall survival (OS) in this cohort. Methods: A total of 16,661 patients with resected NSCLC treated with chemotherapy were extracted from the Surveillance, Epidemiology, and End Results (SEER) database. We identified prognostic factors and integrated them into a nomogram. The model was subjected to bootstrap internal validation using the SEER database and external validation using a database in China and the National Cancer Database (NCDB). The model's predictive accuracy and discriminative ability were tested by calibration and concordance index (C-index). Results: Age, sex, number of dissected lymph nodes, extent of surgery, N stage, T stage, and grade were independent factors for OS and were integrated into the model. The calibration curves for probability of 1-, 3-, and 5-year OS showed excellent agreement between the predicted and actual survivals. The C-index of the nomogram was higher than that of the Tumor-Node-Metastasis staging system for predicting OS (training cohort, 0.62 vs. 0.58; China cohort, 0.68 vs. 0.63; NCDB cohort, 0.59 vs. 0.57). Conclusions: We developed a nomogram that can present individual prediction of OS for patients with resected NSCLC who are undergoing chemotherapy. This practical prognostic tool may help clinicians in treatment planning.
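The concordance index used above to compare the nomogram with TNM staging can be illustrated with a minimal sketch of Harrell's C. This is not the authors' code: the function name `concordance_index`, the toy data, and the simplification of skipping pairs with tied survival times are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the patient with the
    higher predicted risk is the one with the shorter observed survival.

    times       -- follow-up time per patient
    events      -- 1 if death was observed, 0 if censored
    risk_scores -- model output, higher means worse predicted prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Pairs with tied times are skipped here (a simplification).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
print(toy_c)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why differences such as 0.62 vs. 0.58 are read as a modest gain in discrimination.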
Project description: Azoospermia, defined by the absence of sperm in the ejaculate, manifests as obstructive azoospermia (OA) or non-obstructive azoospermia (NOA). Reliable predictive models utilizing biomarkers could aid in clinical decision-making. This study included 352 azoospermia patients, with 152 diagnosed with OA and 200 with NOA. The data were randomly divided into a training set (244 cases) and a validation set (108 cases) for machine learning analysis. The training set was utilized for univariate and multivariate logistic regression to identify key predictors of NOA. Following this, nine machine learning methods were employed to refine the prediction model. A novel nomogram model was developed, and its predictive performance was evaluated using receiver operating characteristic curves, calibration plots, and decision curve analysis. Univariate and multivariate logistic regression analyses identified semen pH and follicle-stimulating hormone (FSH) as positive predictors of NOA, while mean testicular volume (MTV) and inhibin B (INHB) were negatively correlated with NOA. Among nine machine learning methods evaluated, the Gradient Boosting Decision Trees achieved the highest performance with an area under the curve (AUC) of 0.974, whereas Random Forest showed the lowest AUC at 0.953. The nomogram model, incorporating these four factors, demonstrated robust predictive performance with AUCs of 0.984 in the training set and 0.976 in the validation set. Calibration and decision curve analysis confirmed the model's accuracy and clinical utility.
Optimal cut-off points for biomarkers were identified: FSH at 7.50 IU/L (AUC = 0.96), INHB at 43.45 pg/ml (AUC = 0.95), MTV at 9.92 ml (AUC = 0.91), and semen pH at 6.95 (AUC = 0.71). The novel nomogram model incorporating FSH, INHB, MTV, and pH effectively predicts NOA in patients. This model offers a valuable tool for personalized diagnosis and management of azoospermia.
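The per-biomarker AUCs quoted above can be understood through the pairwise definition of the area under the ROC curve: the probability that a randomly chosen NOA case scores higher than a randomly chosen OA case. The sketch below is illustrative only. The function name `auc` and the toy values are hypothetical, and negating a marker that is lower in NOA (as the abstract reports for INHB and MTV) is one common convention, not necessarily what the authors did.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels -- 1 for the positive class (here: NOA), 0 for the negative class
    scores -- marker value oriented so that higher means "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical marker that is *lower* in positives: negate it before scoring.
toy_labels = [1, 1, 0, 0]
toy_marker = [20.0, 50.0, 40.0, 90.0]
toy_auc = auc(toy_labels, [-v for v in toy_marker])
print(toy_auc)
```

The quadratic loop is fine for cohorts of a few hundred patients; a rank-based formula gives the same value more efficiently on larger data.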
Project description: Background: Surveillance is universally recommended for non-small cell lung cancer (NSCLC) patients treated with curative-intent radiotherapy. High-quality evidence to inform optimal surveillance strategies is lacking. Machine learning demonstrates promise in accurate outcome prediction for a variety of health conditions. The purpose of this study was to utilise readily available patient, tumour, and treatment data to develop, validate and externally test machine learning models for predicting recurrence, recurrence-free survival (RFS) and overall survival (OS) at 2 years from treatment. Methods: A retrospective, multicentre study of patients receiving curative-intent radiotherapy for NSCLC was undertaken. A total of 657 patients from 5 hospitals were eligible for inclusion. Data pre-processing derived 34 features for predictive modelling. Combinations of 8 feature reduction methods and 10 machine learning classification algorithms were compared, producing risk-stratification models for predicting recurrence, RFS and OS. Models were compared with 10-fold cross validation and an external test set and benchmarked against TNM-stage and performance status. The Youden Index was derived from validation set ROC curves to distinguish high and low risk groups, and Kaplan-Meier analyses were performed. Findings: Median follow-up time was 852 days. Parameters were well matched across training-validation and external test sets: mean age was 73 and 71 respectively, and recurrence, RFS and OS rates at 2 years were 43% vs 34%, 54% vs 47% and 54% vs 47% respectively. The respective validation and test set AUCs were as follows: 1) RFS: 0·682 (0·575-0·788) and 0·681 (0·597-0·766), 2) Recurrence: 0·687 (0·582-0·793) and 0·722 (0·635-0·81), and 3) OS: 0·759 (0·663-0·855) and 0·717 (0·634-0·8).
Our models were superior to TNM stage and performance status in predicting recurrence and OS. Interpretation: This robust and ready-to-use machine learning method, validated and externally tested, sets the stage for future clinical trials entailing quantitative personalised risk-stratification and surveillance following curative-intent radiotherapy for NSCLC. Funding: A full list of funding bodies that contributed to this study can be found in the Acknowledgements section.
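The Youden Index mentioned in the Methods picks the ROC operating point that maximises J = sensitivity + specificity - 1, which is then used as the cut-off between high- and low-risk groups. A minimal sketch follows, assuming binary outcome labels and model scores where higher means higher risk. The function name `youden_threshold` and the toy data are hypothetical and do not come from the study.

```python
def youden_threshold(labels, scores):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    A case is classified as high risk when its score is >= threshold.
    If several thresholds tie, the lowest one is returned.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are required")
    best_threshold, best_j = None, -1.0
    for t in sorted(set(scores)):
        # True positives: events scored at or above the candidate cut-off.
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        # True negatives: non-events scored below the candidate cut-off.
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Hypothetical validation-set predictions.
toy_labels = [0, 0, 0, 1, 1, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.7, 0.9]
cutoff, j_stat = youden_threshold(toy_labels, toy_scores)
high_risk = [s >= cutoff for s in toy_scores]
print(cutoff, j_stat)
```

Deriving the cut-off on the validation set and then applying it unchanged to the external test set, as the abstract describes, avoids tuning the threshold on the data used for the final evaluation.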
Project description: Background and aims: Primary liver cancer (PLC) is a common malignancy with poor survival and requires long-term follow-up. Hence, nomograms need to be established to predict overall survival (OS) and cancer-specific survival (CSS) from different databases for patients with PLC. Methods: Data of PLC patients were downloaded from Surveillance, Epidemiology, and End Results (SEER) and the Cancer Genome Atlas (TCGA) databases. The Kaplan-Meier method and log-rank test were used to compare differences in OS and CSS. Independent prognostic factors for patients with PLC were determined by univariate and multivariate Cox regression analyses. Two nomograms were developed based on the result of the multivariable analysis and evaluated by calibration curves and receiver operating characteristic curves. Results: OS and CSS nomograms were based on age, race, TNM stage, primary diagnosis, and pathologic stage. The area under the curve (AUC) was 0.777, 0.769, and 0.772 for 1-, 3- and 5-year OS. The AUC was 0.739, 0.729 and 0.780 for 1-, 3- and 5-year CSS. The performance of the two new models was then evaluated using calibration curves. Conclusions: We systematically reviewed the prognosis of PLC and developed two nomograms. Both nomograms facilitate clinical application and may benefit clinical decision-making.
Project description: Lung cancer is the leading cause of cancer death globally, killing 1.8 million people yearly. Over 85% of lung cancer cases are non-small cell lung cancer (NSCLC). Familial clustering of lung cancer indicates that some genes are linked to the disease. Genes associated with NSCLC have been found by next-generation sequencing (NGS) and genome-wide association studies (GWAS). Many papers, however, neglected the complex information about interactions between gene pairs. Along with its high cost, GWAS analysis has the obvious drawback of false-positive results. To address these problems, computational techniques are used to offer researchers alternative and complementary low-cost disease–gene association findings. To help find NSCLC-related genes, we proposed a new network-based machine learning method, named deepRW, to predict genes linked to NSCLC. We first constructed a gene interaction network consisting of genes that are related and unrelated to NSCLC and used DeepWalk and graph convolutional network (GCN) methods to learn gene–disease interactions. Finally, a deep neural network (DNN) was used as the prediction module to decide which genes are related to NSCLC. To evaluate the performance of deepRW, we ran tests with 10-fold cross-validation. The experimental results showed that our method substantially outperformed existing methods. In addition, the effectiveness of each module in deepRW was demonstrated in comparative experiments.
Project description: Background: Early-stage non-small cell lung cancer (NSCLC) is being diagnosed increasingly, and in 30% of diagnosed patients, recurrence will develop within 5 years. Thus, it is urgent to identify recurrence-related markers to optimize the management of patient-tailored therapeutics. Methods: The eligible datasets were downloaded from TCGA and GEO. In the discovery phase, two algorithms, least absolute shrinkage and selection operator (LASSO) and support vector machine-recursive feature elimination, were used to identify candidate genes. The recurrence-associated signature was developed by penalized Cox regression. The nomogram was constructed and further tested via other independent cohorts. Results: In this retrospective study, 14 eligible datasets and 7 published signatures were included. A 13-gene signature generated by penalized Cox regression categorized the training cohort into high-risk and low-risk subgroups (HR = 8.873, 95% CI: 4.228-18.480, p < 0.001). Furthermore, a nomogram integrating the recurrence-related signature, age, and histology was developed to predict recurrence-free survival in the training cohort, and it also performed well in the two external validation cohorts (concordance index: 0.737, 95% CI: 0.732-0.742, p < 0.001; 0.666, 95% CI: 0.650-0.682, p < 0.001; 0.651, 95% CI: 0.637-0.665, p < 0.001, respectively). The nomogram also performed well in the Jiangsu cohort of 163 patients (HR = 2.723, 95% CI: 1.526-4.859, p = 0.001). Post-operative adjuvant therapy achieved evaluated disease-free survival in high and intermediate risk groups (HR = 4.791, 95% CI: 1.081-21.231, p = 0.039). Conclusions: The proposed nomogram is a promising tool for estimating recurrence-free survival in stage I NSCLC, which might have tremendous value in management of early stage NSCLC and guiding adjuvant therapy strategies.
Project description: Background: The objective of this study was to evaluate the predictive power of a survival model using deep learning of diffusion-weighted images (DWI) in patients with non-small-cell lung cancer (NSCLC). Methods: DWI at b-values of 0, 100, and 700 sec/mm² (DWI0, DWI100, DWI700) were preoperatively obtained for 100 NSCLC patients who underwent curative surgery (57 men, 43 women; mean age, 62 years). The ADC0-100 (perfusion-sensitive ADC), ADC100-700 (perfusion-insensitive ADC), ADC0-100-700, and demographic features were collected as input data and 5-year survival was collected as output data. Our survival model adopted transfer learning from a pre-trained VGG-16 network, whereby the softmax layer was replaced with a binary classification layer for the prediction of 5-year survival. Three channels of input data were selected in combination out of DWIs and ADC images, and their accuracies and AUCs were compared for the best performance during 10-fold cross validation. Results: Sixty-six patients survived, and 34 patients died. The predictive performance was the best in the following combination: DWI0-ADC0-100-ADC0-100-700 (accuracy: 92%; AUC: 0.904). This was followed by DWI0-DWI700-ADC0-100-700, DWI0-DWI100-DWI700, and DWI0-DWI0-DWI0 (accuracy: 91%, 81%, 76%; AUC: 0.889, 0.763, 0.711, respectively). Survival prediction models trained with ADC performed significantly better than the one trained with DWI only (p-values < 0.05). The survival prediction was improved when demographic features were added to the model with only DWIs, but the benefit of clinical information was not prominent when added to the best performing model using both DWI and ADC. Conclusions: Deep learning may play a role in the survival prediction of lung cancer. The performance of learning can be enhanced by inputting precedented, proven functional parameters of the ADC instead of the original data of DWIs only.
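The ADC maps named above are derived from DWI signals at pairs or sets of b-values. The sketch below assumes the standard mono-exponential model S(b) = S0 · exp(-b · ADC), under which a two-point ADC is ln(S_low / S_high) / (b_high - b_low). How the study computed ADC0-100-700 is not stated in the abstract, so the log-linear least-squares fit across all three b-values is an assumption, as are the function names and the synthetic voxel.

```python
import math


def adc_two_point(s_low, s_high, b_low, b_high):
    """ADC in mm^2/s from signals at two b-values (in s/mm^2),
    assuming S(b) = S0 * exp(-b * ADC)."""
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signals must be positive")
    if b_high <= b_low:
        raise ValueError("b_high must exceed b_low")
    return math.log(s_low / s_high) / (b_high - b_low)


def adc_log_linear_fit(signals, b_values):
    """ADC as the negated least-squares slope of ln(S) against b.
    Assumed here as one way to combine three or more b-values."""
    n = len(b_values)
    logs = [math.log(s) for s in signals]
    mean_b = sum(b_values) / n
    mean_log = sum(logs) / n
    sxy = sum((b - mean_b) * (l - mean_log) for b, l in zip(b_values, logs))
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    return -sxy / sxx


# Synthetic single voxel with a known ADC of 1.0e-3 mm^2/s (hypothetical).
b_vals = [0.0, 100.0, 700.0]
voxel = [1000.0 * math.exp(-b * 1.0e-3) for b in b_vals]

adc_0_100 = adc_two_point(voxel[0], voxel[1], b_vals[0], b_vals[1])
adc_100_700 = adc_two_point(voxel[1], voxel[2], b_vals[1], b_vals[2])
adc_0_100_700 = adc_log_linear_fit(voxel, b_vals)
print(adc_0_100, adc_100_700, adc_0_100_700)
```

In real tissue the low-b estimate (0 to 100) tends to exceed the high-b estimate (100 to 700) because it also captures perfusion effects, which is consistent with the abstract calling ADC0-100 perfusion-sensitive and ADC100-700 perfusion-insensitive. In the synthetic voxel all three agree because the signal is purely mono-exponential.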
Project description: Objectives: Prognostication of neurologic status among survivors of in-hospital cardiac arrests remains a challenging task for physicians. Although models such as the Cardiac Arrest Survival Post-Resuscitation In-hospital score are useful for predicting neurologic outcomes, they were developed using traditional statistical techniques. In this study, we derive and compare the performance of several machine learning models with each other and with the Cardiac Arrest Survival Post-Resuscitation In-hospital score for predicting the likelihood of favorable neurologic outcomes among survivors of resuscitation. Design: Analysis of the Get With The Guidelines-Resuscitation registry. Setting: Seven-hundred fifty-five hospitals participating in Get With The Guidelines-Resuscitation from January 1, 2001, to January 28, 2017. Patients: Adult in-hospital cardiac arrest survivors. Interventions: None. Measurements and main results: Of 117,674 patients in our cohort, 28,409 (24%) had a favorable neurologic outcome, defined as survival with a Cerebral Performance Category score of less than or equal to 2 at discharge. Using patient characteristics, pre-existing conditions, prearrest interventions, and periarrest variables, we constructed logistic regression, support vector machines, random forests, gradient boosted machines, and neural network machine learning models to predict favorable neurologic outcome. Events prior to October 20, 2009, were used for model derivation, and all subsequent events were used for validation. The gradient boosted machine predicted favorable neurologic status at discharge significantly better than the Cardiac Arrest Survival Post-Resuscitation In-hospital score (C-statistic: 0.81 vs 0.73; p < 0.001) and outperformed all other machine learning models in terms of discrimination, calibration, and accuracy measures.
Variables that were consistently most important for prediction across all models were duration of arrest, initial cardiac arrest rhythm, admission Cerebral Performance Category score, and age. Conclusions: The gradient boosted machine algorithm was the most accurate for predicting favorable neurologic outcomes in in-hospital cardiac arrest survivors. Our results highlight the utility of machine learning for predicting neurologic outcomes in resuscitated patients.
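The derivation/validation design described above is a temporal split: the model is fitted on earlier events and evaluated on later ones, rather than on a random partition. A minimal sketch is shown below. The record layout, the `event_date` field, and the function name `temporal_split` are hypothetical. Treating events on the cut-off date itself as validation data is an assumption, since the abstract only says "prior to" and "subsequent".

```python
from datetime import date

# Cut-off date taken from the abstract.
CUTOFF = date(2009, 10, 20)


def temporal_split(records, cutoff=CUTOFF):
    """Split records into (derivation, validation) by event date.

    Events strictly before `cutoff` go to derivation. Events on or after
    `cutoff` go to validation (assumed handling of the boundary day).
    """
    derivation = [r for r in records if r["event_date"] < cutoff]
    validation = [r for r in records if r["event_date"] >= cutoff]
    return derivation, validation


# Hypothetical registry rows; real records would carry many more variables.
toy_records = [
    {"id": 1, "event_date": date(2003, 5, 14), "favorable": 0},
    {"id": 2, "event_date": date(2009, 10, 19), "favorable": 1},
    {"id": 3, "event_date": date(2009, 10, 20), "favorable": 0},
    {"id": 4, "event_date": date(2015, 2, 1), "favorable": 1},
]
derivation_set, validation_set = temporal_split(toy_records)
print([r["id"] for r in derivation_set], [r["id"] for r in validation_set])
```

A temporal split tests whether a model still performs on patients treated after it was built, which a random split cannot show.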
Project description: Background: The oxidative stress process plays a key role in aging and cancer; however, there is currently a paucity of machine-learning model studies investigating the relationship between oxidative stress and prognosis of elderly patients with esophageal squamous cancer (ESCC). Methods: This study included consecutive elderly patients with ESCC who underwent curative ESCC resection surgery from January 2013 to December 2020 and were stratified into the training and external validation cohorts. Using Cox stepwise regression analysis based on the Akaike information criterion, the relationship between oxidative stress biomarkers and prognosis was explored, and a geriatric ESCC-related oxidative stress score (OSS) was constructed. To construct a predictive model for 3-year overall survival (OS), machine-learning strategies including decision tree (DT), random forest (RF), and support vector machine (SVM) were employed. These machine-learning strategies play a key role in data mining and pattern recognition tasks. Each model was tested in the external validation cohort through 1000 resampling iterations. Validation was conducted using receiver operating characteristic area under the curve (AUC) and calibration plots. Results: The training cohort and validation cohort consisted of 340 and 145 patients, respectively. In the training cohort, the 3-year OS rate for patients was 59.2%. We constructed the OSS based on systemic oxidative stress biomarkers using the training cohort. The study found that pathological N stage, pathological T stage, tumor histological type, lymphovascular invasion, CEA, OSS, CA 19-9, and the amount of bleeding were the most important factors influencing the 3-year OS. These eight features were used to train the RF, DT, and SVM models on the training cohort, and the models were then evaluated on the validation cohort.
In the training cohort, the RF model demonstrated the highest predictive performance with an AUC of 0.975 (0.962-0.987), compared with 0.784 (0.739-0.830) for the DT model and 0.879 (0.843-0.916) for the SVM. In the external validation cohort, the RF model again exhibited the highest performance with an AUC of 0.791 (0.717-0.864), compared with 0.717 (0.640-0.794) for the DT model and 0.779 (0.702-0.856) for the SVM. Conclusions: The random forest clinical prediction model constructed based on OSS can effectively predict the prognosis of elderly patients with ESCC after curative surgery.
Project description: The paper presents results on the accuracy of machine learning approaches applied to the analysis of cardiac activity. The study evaluates the possibility of diagnosing arterial hypertension from short-term heart rate variability signals. Two groups were studied: 30 relatively healthy volunteers and 40 patients suffering from arterial hypertension of degree II-III. The following machine learning approaches were studied: linear and quadratic discriminant analysis, k-nearest neighbors, support vector machine with a radial basis function, decision trees, and the naive Bayes classifier. In addition, different methods of feature extraction were analyzed: statistical, spectral, wavelet, and multifractal. In total, 53 features were investigated. The results show that discriminant analysis achieves the highest classification accuracy. The suggested approach of searching for a noncorrelated feature set achieved better results than a feature set based on principal components.