Project description:Rationale: More targeted management of severe acute pediatric asthma could improve clinical outcomes. Objectives: To identify distinct clinical phenotypes of severe acute pediatric asthma using variables obtained in the first 12 h of hospitalization. Methods: We conducted a retrospective cohort study in a quaternary care children's hospital from 2014 to 2022. Encounters for children ages 2-18 years admitted to the hospital for asthma were included. We used consensus k-means clustering with patient demographics, vital signs, diagnostics, and laboratory data obtained in the first 12 h of hospitalization. Measurements and main results: The study population included 683 encounters divided into derivation (80%) and validation (20%) sets, and two distinct clusters were identified. Compared to Cluster 1 in the derivation set, Cluster 2 encounters (177 [32%]) were older (11 years [8; 14] vs. 5 years [3; 8]; p < .01) and more commonly males (63% vs. 53%; p = .03) of Black race (51% vs. 40%; p = .03) with non-Hispanic ethnicity (96% vs. 84%; p < .01). Cluster 2 encounters had smaller improvements in vital signs at 12 h, including percent change in heart rate (-1.7 [-11.7; 12.7] vs. -7.8 [-18.5; 1.7]; p < .01) and respiratory rate (0.0 [-20.0; 22.2] vs. -11.4 [-27.3; 9.0]; p < .01). Encounters in Cluster 2 had lower percentages of neutrophils (70.0 [55.0; 83.0] vs. 85.0 [77.0; 90.0]; p < .01) and higher percentages of lymphocytes (17.0 [8.0; 32.0] vs. 9.0 [5.3; 14.0]; p < .01). Cluster 2 encounters had higher rates of invasive mechanical ventilation (23% vs. 5%; p < .01), longer hospital length of stay (4.5 [2.6; 8.8] vs. 2.9 [2.0; 4.3]; p < .01), and a higher mortality rate (7.3% vs. 0.0%; p < .01). The predicted cluster assignments in the validation set shared the same ratio (~2:1) and many of the same characteristics. Conclusions: We identified two clinical phenotypes of severe acute pediatric asthma which exhibited distinct clinical features and outcomes.
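The consensus k-means step named in the asthma abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `kmeans` and `consensus_matrix`, the 80% subsampling fraction, the number of runs, and the Euclidean distance are all assumptions chosen for illustration. The idea is to cluster many random subsamples and record how often each pair of encounters lands in the same cluster.

```python
import random
from itertools import combinations


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's k-means on tuples of floats; returns one label per point."""
    centers = rng.sample(points, k)  # random initial centers (assumption)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_matrix(points, k, runs=100, frac=0.8, seed=0):
    """Fraction of co-sampled runs in which each pair shared a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(runs):
        idx = rng.sample(range(n), max(k, int(frac * n)))
        labels = kmeans([points[i] for i in idx], k, rng)
        for (i, li), (j, lj) in combinations(zip(idx, labels), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if li == lj:
                together[i][j] += 1
                together[j][i] += 1
    return [
        [1.0 if i == j
         else (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
```

Entries near 1 or 0 indicate pairs that are consistently grouped together or apart; a final partition is typically obtained by clustering this matrix, and the number of clusters is chosen where the consensus values are most clearly separated. Plain k-means with a single random start can settle in a poor local optimum, so a practical version would likely add restarts and feature scaling.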
Project description:Diabetic nephropathy (DN), a multifaceted disease with various contributing factors, presents challenges in understanding its underlying causes. Uncovering biomarkers linked to this condition can shed light on its pathogenesis and support the creation of new diagnostic and treatment methods. Gene expression data were sourced from accessible public databases, and Weighted Gene Co-expression Network Analysis (WGCNA) was employed to pinpoint gene co-expression modules relevant to DN. Subsequently, various machine learning techniques, such as random forest, lasso regression algorithm (LASSO), and support vector machine-recursive feature elimination (SVM-RFE), were utilized for distinguishing DN cases from controls using the identified gene modules. Additionally, functional enrichment analyses were conducted to explore the biological roles of these genes. Our analysis revealed 131 genes showing distinct expression patterns between controlled and uncontrolled groups. During the integrated WGCNA, we identified 61 co-expressed genes encompassing both categories. The enrichment analysis highlighted involvement in various immune responses and complex activities. Techniques like Random Forest, LASSO, and SVM-RFE were applied to pinpoint key hub genes, leading to the identification of VWF and DNASE1L3. In the context of DN, they demonstrated significant consistency in both expression and function. Our research uncovered potential biomarkers for DN through the application of WGCNA and various machine learning methods. The results indicate that these 2 central genes could serve as innovative diagnostic indicators and therapeutic targets for this disease. This discovery offers fresh perspectives on the development of DN and could contribute to the advancement of new diagnostic and treatment approaches.
Project description:Background: The genetic factors and pathogenesis of idiopathic dilated cardiomyopathy-induced heart failure (IDCM-HF) are not thoroughly understood, and the disease lacks specific diagnostic markers and treatment methods. Hence, we aimed to identify the mechanisms of action at the molecular level and potential molecular markers for this disease. Methods: Gene expression profiles of IDCM-HF and non-heart failure (NF) specimens were acquired from the Gene Expression Omnibus (GEO) database. We then identified the differentially expressed genes (DEGs) and analyzed their functions and related pathways by using "Metascape". Weighted gene co-expression network analysis (WGCNA) was utilized to search for key module genes. Candidate genes were identified by intersecting the key module genes identified via WGCNA with DEGs and further screened via the support vector machine-recursive feature elimination (SVM-RFE) method and the least absolute shrinkage and selection operator (LASSO) algorithm. Finally, the diagnostic efficacy of the biomarkers was evaluated by the area under the curve (AUC) value, and their differential expression in the IDCM-HF and NF groups was confirmed using an external database. Results: We detected 490 genes exhibiting differential expression between IDCM-HF and NF specimens from the GSE57338 dataset, most of which were enriched in biological processes and pathways related to the extracellular matrix (ECM). After screening, 13 candidate genes were identified. Aquaporin 3 (AQP3) and cytochrome P450 2J2 (CYP2J2) showed high diagnostic efficacy in the GSE57338 and GSE6406 datasets, respectively. In comparison to the NF group, AQP3 was significantly down-regulated in the IDCM-HF group, while CYP2J2 was significantly up-regulated. Conclusion: As far as we know, this is the first study that combines WGCNA and machine learning algorithms to screen for potential biomarkers of IDCM-HF. 
Our findings suggest that AQP3 and CYP2J2 could be used as novel diagnostic markers and treatment targets of IDCM-HF.
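The screening logic described in the IDCM-HF abstract (intersect WGCNA key module genes with DEGs, then screen further with LASSO and SVM-RFE) reduces to set operations once each method has produced its gene list. The sketch below assumes that "further screened" means keeping genes retained by both selectors, which is common practice but is not stated in the abstract. The function name and the gene identifiers in the usage note are hypothetical placeholders, not study data.

```python
def screen_candidates(module_genes, degs, lasso_selected, svm_rfe_selected):
    """Return (candidates, final) gene lists, both sorted alphabetically.

    candidates: genes that are both key-module genes and DEGs.
    final: candidates retained by BOTH feature selectors (an assumption
           about how the two selections are combined).
    """
    candidates = set(module_genes) & set(degs)
    final = candidates & set(lasso_selected) & set(svm_rfe_selected)
    return sorted(candidates), sorted(final)
```

For example, with placeholder lists `["G1", "G2", "G3", "G4"]` (module), `["G2", "G3", "G4", "G5"]` (DEGs), `["G2", "G3", "G9"]` (LASSO) and `["G3", "G4", "G2"]` (SVM-RFE), the candidates are `G2`, `G3`, `G4` and the final set is `G2`, `G3`.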
Project description:Women with uncomplicated urinary tract infection (UTI) symptoms are commonly treated with empirical antibiotics, resulting in overuse of antibiotics, which promotes antimicrobial resistance. Available diagnostic tools are either not cost-effective or diagnostically sub-optimal. Here, we identified clinical and urinary immunological predictors for UTI diagnosis. We explored 17 clinical and 42 immunological potential predictors of bacterial culture results among women with uncomplicated UTI symptoms using random forest or support vector machine coupled with recursive feature elimination. Urine cloudiness was the best performing clinical predictor to rule out (negative likelihood ratio [LR-] = 0.4) and rule in (LR+ = 2.6) UTI. Using a more discriminatory scale to assess cloudiness (turbidity) increased the accuracy of UTI prediction further (LR+ = 4.4). Urinary levels of MMP9, NGAL, CXCL8 and IL-1β together had a higher LR+ (6.1) and similar LR- (0.4), compared to cloudiness. Varying the bacterial count thresholds for urine culture positivity did not alter best clinical predictor selection, but did affect the number of immunological predictors required for reaching an optimal prediction. We conclude that urine cloudiness is particularly helpful in ruling out UTI. The identified urinary biomarkers could be used to develop a point of care test for UTI but require further validation.
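The likelihood ratios reported in the UTI abstract follow from sensitivity and specificity: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below computes both from a 2x2 table and converts a pre-test probability into a post-test probability through odds. The function names and any counts passed to them are illustrative assumptions; the abstract does not report the underlying 2x2 tables.

```python
def likelihood_ratios(tp, fp, fn, tn):
    """Return (LR+, LR-) from true/false positive and negative counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_prob, lr):
    """Update a pre-test probability with a likelihood ratio via odds."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)
```

As a worked example under a hypothetical pre-test probability of 50%, the reported LR+ of 2.6 for cloudy urine gives a post-test probability of about 72% (`post_test_probability(0.5, 2.6)`), and the reported LR- of 0.4 for clear urine gives about 29% (`post_test_probability(0.5, 0.4)`).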
Project description:Cognitive dysfunction caused by diabetes has become a serious global medical issue. Diabetic kidney disease (DKD) exacerbates cognitive dysfunction in patients, although the precise mechanism behind this remains unclear. Here, we conducted an investigation using RNA sequencing data from the Gene Expression Omnibus (GEO) database. We analyzed the differentially expressed genes in DKD and three types of neurons in the temporal cortex (TC) of diabetic patients with cognitive dysfunction. Through our analysis, we identified a total of 133 differentially expressed genes (DEGs) shared between DKD and TC neurons (62 up-regulated and 71 down-regulated). To identify potential common biomarkers, we employed machine learning algorithms (LASSO and SVM-RFE) and Venn diagram analysis. Ultimately, we identified 8 overlapping marker genes (ZNF564, VPS11, YPEL4, VWA5B1, A2ML1, KRT6A, SEC14L1P1, SH3RF1) as potential biomarkers, which exhibited high sensitivity and specificity in ROC curve analysis. Functional analysis using Gene Ontology (GO) revealed that these genes were primarily enriched in autophagy, ubiquitin/ubiquitin-like protein ligase activity, MAP-kinase scaffold activity, and syntaxin binding. Further enrichment analysis using Gene Set Enrichment Analysis (GSEA) and Gene Set Variation Analysis (GSVA) indicates that these biomarkers may play a crucial role in the development of cognitive dysfunction and diabetic nephropathy. Building upon these biomarkers, we developed a diagnostic model with a reliable predictive ability for DKD complicated by cognitive dysfunction. To validate the 8 biomarkers, we conducted RT-PCR analysis in the cortex, hippocampus and kidney of animal models. The results demonstrated the up-regulation of SH3RF1 in the cortex, hippocampus and kidney of mice, which was further confirmed by immunofluorescence and Western blot validation. Notably, SH3RF1 is a scaffold protein involved in cell survival in the JNK signaling pathway. 
Based on these findings, we propose that SH3RF1 may be a common gene expression feature that influences DKD and cognitive dysfunction through the apoptotic pathway.
Project description:Purpose: This study aims to identify potential myopia biomarkers using machine learning algorithms, enhancing myopia diagnosis and prognosis prediction. Methods: GSE112155 and GSE15163 datasets from the GEO database were analyzed. We used "limma" for differential expression analysis and "GO plot" and "clusterProfiler" for functional and pathway enrichment analyses. The LASSO and SVM-RFE algorithms were employed to screen myopia-related biomarkers, followed by ROC curve analysis for diagnostic performance evaluation. Single-gene GSEA enrichment analysis was executed using GSEA 4.1.0. Results: The functional analysis of differentially expressed genes indicated their role in carbohydrate generation and polysaccharide synthesis. We identified 23 differentially expressed genes associated with myopia, four of which were highly effective diagnostic biomarkers. Single gene GSEA results showed these genes control the ubiquitin-mediated protein hydrolysis pathway. Conclusion: Our study identifies four key myopia biomarkers, providing a foundation for future clinical and experimental validation studies.
Project description:Background: The integration of machine learning (ML) in predicting asthma-related outcomes in children presents a novel approach in pediatric health care. Objective: This scoping review aims to analyze studies published since 2019, focusing on ML algorithms, their applications, and predictive performances. Methods: We searched Ovid MEDLINE ALL and Embase on Ovid, the Cochrane Library (Wiley), CINAHL (EBSCO), and Web of Science (core collection). The search covered the period from January 1, 2019, to July 18, 2023. Studies applying ML models in predicting asthma-related outcomes in children aged <18 years were included. Covidence was used for citation management, and the risk of bias was assessed using the Prediction Model Risk of Bias Assessment Tool. Results: From 1231 initial articles, 15 met our inclusion criteria. The sample size ranged from 74 to 87,413 patients. Most studies used multiple ML techniques, with logistic regression (n=7, 47%) and random forests (n=6, 40%) being the most common. Key outcomes included predicting asthma exacerbations, classifying asthma phenotypes, predicting asthma diagnoses, and identifying potential risk factors. For predicting exacerbations, recurrent neural networks and XGBoost showed high performance, with XGBoost achieving an area under the receiver operating characteristic curve (AUROC) of 0.76. In classifying asthma phenotypes, support vector machines were highly effective, achieving an AUROC of 0.79. For diagnosis prediction, artificial neural networks outperformed logistic regression, with an AUROC of 0.63. To identify risk factors focused on symptom severity and lung function, random forests achieved an AUROC of 0.88. Sound-based studies distinguished wheezing from nonwheezing and asthmatic from normal coughs. The risk of bias assessment revealed that most studies (n=8, 53%) exhibited low to moderate risk, ensuring a reasonable level of confidence in the findings. 
Common limitations across studies included data quality issues, sample size constraints, and interpretability concerns. Conclusions: This review highlights the diverse application of ML in predicting pediatric asthma outcomes, with each model offering unique strengths and challenges. Future research should address data quality, increase sample sizes, and enhance model interpretability to optimize ML utility in clinical settings for pediatric asthma management.
Project description:Schizophrenia is a group of severe neurodevelopmental disorders. Identification of peripheral diagnostic biomarkers is an effective approach to improving diagnosis of schizophrenia. In this study, four datasets of schizophrenia patients' blood or serum samples were downloaded from the GEO database, merged, and de-batched for the analysis of differentially expressed genes (DEGs) and weighted gene co-expression network analysis (WGCNA). The WGCNA showed that the cyan module, among 9 modules, was significantly related to schizophrenia, which subsequently yielded 317 schizophrenia-related key genes by comparing with the DEGs. The enrichment analyses on these key genes indicated a strong correlation with immune-related processes. The CIBERSORT algorithm was adopted to analyze immune cell infiltration, which revealed differences in eosinophils, M0 macrophages, resting mast cells, and gamma delta T cells. Furthermore, by comparing with the immune genes obtained from online databases, 95 immune-related key genes for schizophrenia were screened out. Moreover, machine learning algorithms including Random Forest, LASSO, and SVM-RFE were used to further screen immune-related hub genes of schizophrenia. Finally, CLIC3 was identified as an immune-related hub gene of schizophrenia by the three machine learning algorithms. A schizophrenia rat model established to validate CLIC3 expression showed that CLIC3 levels were reduced in the plasma and brains of model rats in a brain region-dependent manner, and that this reduction could be reversed by the antipsychotic drug risperidone. In conclusion, using various bioinformatic and biological methods, this study found an immune-related hub gene of schizophrenia, CLIC3, that might be a potential diagnostic biomarker and therapeutic target for schizophrenia.
Project description:Background: Chronic lymphocytic leukemia (CLL) is the most common type of leukemia in adults. Thus, novel reliable biomarkers need to be further explored to increase diagnostic, therapeutic, and prognostic effectiveness. Methods: Six datasets containing CLL and control samples were downloaded from the Gene Expression Omnibus database. Differential gene expression analysis, weighted gene coexpression network analysis (WGCNA), and the least absolute shrinkage and selection operator (LASSO) regression were applied to identify potential diagnostic biomarkers for CLL using R software. The diagnostic performance of the hub genes was then measured by the receiver operating characteristic (ROC) curve analysis. Functional analysis was implemented to uncover the underlying mechanisms. Additionally, correlation analysis was performed to assess the relationship between the hub genes and immunity characteristics. Results: A total number of 47 differentially expressed genes (DEGs) and 25 candidate hub genes were extracted through differential gene expression analysis and WGCNA, respectively. Based on the 14 overlapped genes between the DEGs and the candidate hub genes, LASSO regression analysis was used, which identified a final number of six hub genes as potential biomarkers for CLL: ABCA6, CCDC88A, PMEPA1, EBF1, FILIP1L, and TEAD2. The ROC curves of the six genes showed reliable predictive ability in the training and validation cohorts, with all area under the curve (AUC) values over 0.80. Functional analysis revealed an abnormal immune status in the CLL patients. A significant correlation was found between the hub genes and the immune-related pathways, indicating a possible tight connection between the hub genes and tumor immunity in CLL. Conclusion: This study was based on machine learning algorithms, and we identified six genes that could be possible CLL markers, which may be involved in CLL pathogenesis and progression through immune-related signal pathways.
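The AUC values used to judge the hub genes have a simple interpretation: the AUC equals the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition. The function name and the example scores are hypothetical; for a single gene, the "score" would be its expression level, possibly sign-flipped for down-regulated genes.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties contribute 0.5, which matches the Mann-Whitney U formulation.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

With hypothetical case scores `[0.9, 0.8, 0.4]` and control scores `[0.7, 0.3, 0.2]`, eight of the nine pairs are ranked correctly, so the AUC is 8/9. This pairwise form is quadratic in sample size, which is fine for the cohort sizes typical of expression datasets; a rank-based version scales better.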
Project description:Objective: The cause and mechanism of non-obstructive azoospermia (NOA) is complicated; therefore, an effective therapy strategy is yet to be developed. This study aimed to analyse the pathogenesis of NOA at the molecular biological level and to identify the core regulatory genes, which could be utilised as potential biomarkers. Methods: Three NOA microarray datasets (GSE45885, GSE108886, and GSE145467) were collected from the GEO database and merged into training sets; a further dataset (GSE45887) was then defined as the validation set. Differential gene analysis, consensus cluster analysis, and WGCNA were used to identify preliminary signature genes; then, enrichment analysis was applied to these previously screened signature genes. Next, 4 machine learning algorithms (RF, SVM, GLM, and XGB) were used to detect potential biomarkers that are most closely associated with NOA. Finally, a diagnostic model was constructed from these potential biomarkers and visualised as a nomogram. The differential expression and predictive reliability of the biomarkers were confirmed using the validation set. Furthermore, the competing endogenous RNA network was constructed to identify the regulatory mechanisms of potential biomarkers; further, the CIBERSORT algorithm was used to calculate immune infiltration status among the samples. Results: A total of 215 differentially expressed genes (DEGs) were identified between NOA and control groups (27 upregulated and 188 downregulated genes). The WGCNA results identified 1123 genes in the MEblue module as target genes that are highly correlated with NOA positivity. The NOA samples were divided into 2 clusters using consensus clustering; further, 1027 genes in the MEblue module, which were screened by WGCNA, were considered to be target genes that are highly correlated with NOA classification. The 129 overlapping genes were then established as signature genes. 
The XGB algorithm that had the maximum AUC value (AUC=0.946) and the minimum residual value was used to further screen the signature genes. IL20RB, C9orf117, HILS1, PAOX, and DZIP1 were identified as potential NOA biomarkers. This 5-biomarker model had the highest AUC value, of up to 0.982, compared to other single biomarker models; additionally, the results of this biomarker model were verified in the validation set. Conclusions: As IL20RB, C9orf117, HILS1, PAOX, and DZIP1 have been determined to possess the strongest association with NOA, these five genes could be used as potential therapeutic targets for NOA patients. Furthermore, the model constructed using these five genes, which possessed the highest diagnostic accuracy, may be an effective biomarker model that warrants further experimental validation.