Project description:The identification of gene regulatory modules is an important yet challenging problem in computational biology. While many computational methods have been proposed to identify regulatory modules, their initial success is largely compromised by a high rate of false positives, especially when applied to human cancer studies. New strategies are needed for reliable regulatory module identification. We present a new approach, multi-level support vector regression (ml-SVR), to systematically identify condition-specific regulatory modules. The approach is built upon a multi-level analysis strategy designed to suppress false positive predictions. With this strategy, a regulatory module becomes more significant as more relevant gene sets are formed at finer levels. At each level, a two-stage support vector regression (SVR) method helps reduce false positive predictions by integrating binding motif information and gene expression data; a significance analysis procedure then assesses the significance of each regulatory module. We applied our method to breast cancer cell line data to identify condition-specific regulatory modules associated with estrogen treatment. Experimental results show that our method can identify biologically meaningful regulatory modules related to estrogen signaling and action in breast cancer. Three independent total RNA samples were extracted for each cell line (MCF-7 and MCF-7-stripped) and the samples were arrayed using Affymetrix GeneChip HG-U133A. MCF-7-stripped denotes estrogen-deprived MCF-7 human breast cancer cells, which were grown in the absence of estrogen for 96 hours. We analyzed the enriched motifs and their targets for the genes significantly down-regulated in MCF-7-stripped cells compared with MCF-7 cells.
Project description:Multiple organ dysfunction syndrome (MODS) can result from a variety of initiating events such as infection or trauma. The clinical condition of some MODS patients may deteriorate and require resource-intensive, high-risk cardiopulmonary support via extracorporeal membrane oxygenation (ECMO). Until now, no diagnostic criteria or molecular biomarkers have been developed to identify MODS patients who will subsequently require ECMO support. We used multi-time-point (0h, 72h and 8d) whole-transcriptome data from whole blood of 27 patients (4 controls, 17 MODS and 6 ECMO) to derive molecular signatures for identifying MODS patients who require ECMO support. We observed that the immune response (neutrophil level) was compromised in MODS patients who required ECMO support. Differential gene expression analysis and gene ontology enrichment revealed that epigenetic modifications were activated during deterioration from MODS to ECMO. In addition, a signature of 6 genes was identified using logistic regression; these genes can be used as putative diagnostic markers for patients needing ECMO support.
Project description:LsRP based on amino acid locus information and Support Vector Regression can be used to predict peptide retention time in LC-MS experiments.
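To make the regression idea concrete, the following is a minimal sketch of a linear support vector regressor fitted by subgradient descent on the epsilon-insensitive loss. It is not the LsRP implementation: the single numeric feature per peptide, the toy targets, and the function names `fit_linear_svr` and `predict` are all assumptions introduced here for illustration, and a real retention-time model would use a richer feature encoding and an established SVR library.

```python
import math

def fit_linear_svr(X, y, epsilon=0.1, C=10.0, lr0=0.05, epochs=2000):
    """Minimize 0.5*||w||^2 + (C/n) * sum(max(0, |w.x + b - y| - epsilon))
    by subgradient descent with a decaying step size (illustrative only)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for t in range(epochs):
        lr = lr0 / math.sqrt(t + 1)
        grad_w = list(w)  # gradient of the 0.5*||w||^2 regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            residual = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            if abs(residual) > epsilon:  # point lies outside the epsilon-tube
                sign = 1.0 if residual > 0 else -1.0
                for j in range(d):
                    grad_w[j] += (C / n) * sign * xi[j]
                grad_b += (C / n) * sign
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Linear prediction w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

# Hypothetical toy data: one numeric feature per peptide, target = 2*x + 1.
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_linear_svr(X, y)
predictions = [predict(w, b, x) for x in X]
```

Residuals smaller than `epsilon` contribute no loss, so the fit tolerates small deviations and only points outside the tube pull on the weights; this is the property that distinguishes SVR from ordinary least squares.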
Project description:We introduce MSTracer, a tool for peptide feature detection from MS1, which incorporates a machine-learning-based scoring function built on peptide isotopic distribution and peptide intensity shape on the LC-MS map. By using Support Vector Regression (SVR), the quality of detected peptide features is remarkably improved. Neural Networks (NN) are also used to assign each detected feature a score that indicates its quality. We use the human HeLa LC-MS/MS dataset to train and test the tool and compare its results with MaxQuant, OpenMS, and Dinosaur.
Project description:The goal of this study was to identify potential AMH-induced genes and regulatory networks controlling regression by RNA-Seq transcriptome analysis of differences in Müllerian Duct mesenchyme between males (AMH signaling on) and females (AMH signaling off) in purified fetal Müllerian Duct mesenchymal cells. This analysis found 82 genes up-regulated in males during MD regression and identified Osterix (Osx)/Sp7, a key transcriptional regulator of osteoblast differentiation and bone formation, as a novel downstream effector of AMH signaling during MD regression.
Project description:Background:
To assist clinicians with diagnosis and optimal treatment decision-making, we attempted to develop and validate an artificial intelligence prediction model for lung metastasis (LM) in colorectal cancer (CRC) patients.
Method:
The clinicopathological characteristics of 46037 CRC patients from the Surveillance, Epidemiology, and End Results (SEER) database and 2779 CRC patients from a multi-center external validation set were collected retrospectively. After feature selection by univariate and multivariate analyses, six machine learning (ML) models, including logistic regression, K-nearest neighbor, support vector machine, decision tree, random forest, and balanced random forest (BRF), were developed and validated for LM prediction. The optimized model with the best performance was compared with the clinical predictor. In addition, LM patients stratified by risk score were used for survival analysis.
Project description:Study Objective: Identify small molecule biomarkers of insufficient sleep using untargeted plasma metabolomics in humans undergoing experimental insufficient sleep. Methods: We conducted a cross-over laboratory study where 16 normal-weight participants (8 men; age 22 ± 5 years; body mass index < 25 kg/m2) completed three baseline days (BL; 9h sleep opportunity per night) followed by five-day insufficient (5H; 5h sleep opportunity per night) and adequate (9H; 9h sleep opportunity per night) sleep conditions. Energy-balanced diets were provided during baseline, with ad libitum energy intake provided during the insufficient and adequate sleep conditions. Untargeted plasma metabolomics analyses were performed using blood samples collected every 4h across the final 24h of each condition. Biomarker models were developed using logistic regression and linear support vector machine algorithms. Results: The top performing biomarker model was developed by linear support vector machine modeling, consisted of 65 compounds, and discriminated insufficient versus adequate sleep with 74% overall accuracy and a Matthews correlation coefficient of 0.39. The compounds in the top performing biomarker model were associated with ATP Binding Cassette Transporters in Lipid Homeostasis, Phospholipid Metabolic Process, Plasma Lipoprotein Remodeling, and sphingolipid metabolism. Conclusion: We identified potential metabolomics-based biomarkers of insufficient sleep in humans. Further development and validation of omics-based biomarkers of insufficient sleep will advance our understanding of the negative consequences of insufficient sleep, improve diagnosis of poor sleep health, and identify targets for countermeasures designed to mitigate the negative health consequences of insufficient sleep.
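The two reported metrics, overall accuracy and the Matthews correlation coefficient (MCC), can both be computed from a binary confusion matrix. The sketch below shows the standard formulas; the counts are hypothetical values chosen for illustration and are not the study's confusion matrix, and the function names are introduced here.

```python
import math

def accuracy(tp, fp, fn, tn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

def matthews_corrcoef(tp, fp, fn, tn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero (a common convention)."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0

# Hypothetical counts (NOT taken from the study): 100 samples in total.
tp, fp, fn, tn = 40, 20, 10, 30

acc = accuracy(tp, fp, fn, tn)           # 0.7
mcc = matthews_corrcoef(tp, fp, fn, tn)  # 1/sqrt(6), about 0.408
```

MCC ranges from -1 to 1 and accounts for all four cells of the confusion matrix, which is why a model can show moderately high accuracy alongside a more modest MCC.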
Project description:Cell type-specific gene expression patterns are outputs of transcriptional gene regulatory networks (GRNs) that dynamically reconfigure to drive diverse cellular states. Single-cell transcriptomic technologies, such as single-cell RNA-sequencing (scRNA-seq) and single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq), can examine the transcriptional state at individual cell resolution, allowing the study of cellular heterogeneity and cell type-specific gene regulation in dynamic processes such as cell fate specification. However, current approaches to infer cell type-specific gene regulatory networks from these datasets are limited in their ability to integrate scRNA-seq and scATAC-seq measurements and to model network dynamics on a cell lineage. To address this challenge, we have developed single-cell Multi-Task Network Inference (scMTNI), a multi-task learning framework to infer the gene regulatory network for each cell type on a lineage from scRNA-seq and scATAC-seq data. Using simulated, published, and newly collected single-cell omic datasets, we show that scMTNI accurately infers gene regulatory networks and captures meaningful network dynamics that identify GRN components associated with cell type transitions. Application of our method to mouse cellular reprogramming identified key regulators associated with cell populations that reprogram versus those that are stalled. Taken together, scMTNI is a powerful framework to infer cell type-specific gene regulatory networks and their dynamics from scRNA-seq and scATAC-seq datasets.
Project description:This SuperSeries is composed of the SubSeries listed below. It contains multi-omics datasets (transcriptomic RNA-Seq, methylomic WGBS, and epigenomic ATAC-seq) obtained during the onset of sexual maturation in Atlantic salmon. We used gene regulatory networks (GRNs) to integrate results from these multi-omic analyses to identify key regulators of maturation.