Project description: Cellular senescence is a state of irreversible cell cycle arrest that contributes to age-associated decline through the accumulation of senescent cells and their senescence-associated secretory phenotype (SASP). The SASP, comprising inflammatory signaling molecules and growth factors, can induce secondary senescence in surrounding healthy cells via paracrine signaling. Secondary senescence can also be induced through juxtacrine signaling via activation of NOTCH1. To further understand the progression of cells into primary and secondary senescence, we used a comprehensive single-cell RNA sequencing data set of clonal human lung fibroblast (LF1) cell lines representing different forms of senescence and quiescence. Here, we present SENTRY (SENescent TRacking sYstem), a new method that uses unsupervised clustering to identify subpopulations of cells common to most major forms of senescence, revealing that the RNA profiles of these subpopulations are driven in part by markers associated with secondary senescence. Leveraging these data, we developed random forest models that predict senescent status and subtype with exceptionally high accuracy. SHAP (SHapley Additive exPlanations) analysis identified the most informative genes for classification, many of which are unique to our model compared with existing senescence signatures such as SenMayo and SenSig. We then used this classification to analyze single-cell RNA sequencing data from a time course of proliferating and senescent human lung fibroblasts. We observed that primary and secondary senescent cells exhibit distinct transcriptomic and epigenetic profiles, particularly in pathways related to cell cycle regulation, extracellular matrix remodeling, and inflammatory signaling. Additionally, we performed a comparative analysis of gene-length-dependent transcription decline (GLTD) at different stages of senescence induction.
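The description above does not say how GLTD was quantified. As a minimal sketch only, one simple way to summarize a length-dependent decline is to correlate log gene length with the log2 fold change in expression between two conditions; a negative correlation would mean that longer genes lose more expression. The gene table, the pseudocount, and the name `gltd_score` below are all invented for illustration and are not taken from the study.

```python
import math

# Hypothetical toy table, invented for illustration only:
# (gene length in bp, mean expression before, mean expression after).
TOY_GENES = [
    (1_000, 100.0, 110.0),
    (5_000, 100.0, 100.0),
    (20_000, 100.0, 90.0),
    (100_000, 100.0, 70.0),
    (500_000, 100.0, 50.0),
]


def log2_fold_change(before, after, pseudocount=1.0):
    """Log2 ratio of after/before; the pseudocount avoids division by zero."""
    return math.log2((after + pseudocount) / (before + pseudocount))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Correlate log10 gene length with the per-gene log2 fold change.
log_lengths = [math.log10(length) for length, _, _ in TOY_GENES]
fold_changes = [log2_fold_change(before, after) for _, before, after in TOY_GENES]
gltd_score = pearson(log_lengths, fold_changes)

print(f"length vs. log2 fold change correlation: {gltd_score:.2f}")
```

On real single-cell data, a rank-based correlation or a per-length-bin comparison may be more robust than Pearson correlation; the choice here is only for brevity.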
Project description: Gene expression profiles were generated from RNA isolated from 199 primary breast cancer patients. Samples 1-176 were also used in another study (GEO Series GSE22820) and form the training set in this study; samples 200-222 form the validation set. These data were used to build a machine learning classifier that predicts estrogen receptor (ER) status using only three gene features.
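The description names neither the three genes nor the learning algorithm, so the following is only a rough sketch of the train/validation design: fit on samples 1-176, then score on samples 200-222, using three features per sample. A nearest-centroid classifier stands in for the unnamed method, the gene names are placeholders, and the expression values and ER labels are synthetic.

```python
import random

# Hypothetical placeholder names; the real three genes are not given.
GENES = ("gene_a", "gene_b", "gene_c")
# Invented class centers used only to generate synthetic expression values.
CLASS_CENTERS = {True: (8.0, 7.0, 6.0), False: (4.0, 3.0, 2.0)}


def make_cohort(rng, sample_ids):
    """Build {sample_id: (features, er_positive)} from synthetic data."""
    cohort = {}
    for sample_id in sample_ids:
        er_positive = sample_id % 3 != 0  # arbitrary deterministic labels
        features = [rng.gauss(mu, 1.0) for mu in CLASS_CENTERS[er_positive]]
        cohort[sample_id] = (features, er_positive)
    return cohort


def fit_centroids(cohort):
    """Compute the mean feature vector of each class."""
    centroids = {}
    for label in (True, False):
        rows = [feats for feats, y in cohort.values() if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, features):
    """Return the label whose centroid is closest in squared distance."""
    def sq_dist(center):
        return sum((f - c) ** 2 for f, c in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, cohort):
    """Fraction of samples in the cohort that are classified correctly."""
    hits = sum(predict(centroids, feats) == y for feats, y in cohort.values())
    return hits / len(cohort)


rng = random.Random(0)
train = make_cohort(rng, range(1, 177))    # samples 1-176
valid = make_cohort(rng, range(200, 223))  # samples 200-222
centroids = fit_centroids(train)
validation_accuracy = accuracy(centroids, valid)

print(f"train n={len(train)}, validation n={len(valid)}")
print(f"validation accuracy: {validation_accuracy:.2f}")
```

Keeping the validation samples entirely out of `fit_centroids` is the point of the design: the reported accuracy then reflects samples the model never saw during fitting.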
Project description: RNA sequencing (RNA-seq) is widely used for analysis of alternative splicing, but in practice it has inherent biases that hinder its ability to detect and quantify splicing events. To address this, we present a targeted RNA-seq method that specifically enriches for splicing-informative junction-spanning reads. Local Splicing Variation sequencing (LSV-seq) utilizes multiplexed reverse transcription from highly scalable pools of primers anchored near splice junctions of interest. Primers are designed using Optimal Prime, a novel dedicated machine learning algorithm trained on the performance of thousands of primer sequences. LSV-seq achieves high on-target capture rates and concordance with RNA-seq, while requiring several-fold lower sequencing depth. We use LSV-seq to target events with low coverage in GTEx RNA-seq data and discover hundreds of previously hidden tissue-specific splicing events. Our results demonstrate the ability of LSV-seq to capture alternative splicing with exceptional sensitivity and highlight its potential to improve the detection of other RNA features of interest.
Project description: Leaf senescence is a tightly controlled and complex developmental process that shares many similarities across species, yet our understanding of the underlying conserved molecular mechanisms is still lacking. Here, we observed functional conservation of the pathways underlying leaf senescence in A. thaliana, O. sativa, and S. lycopersicum. Machine learning-based integration of data from nearly 10 000 samples yielded a universal regulatory network of leaf senescence and identified mitostasis as the central cross-species biological hub. We measured and compared changes in the transcriptome and metabolome of A. thaliana, O. sativa, and S. lycopersicum leaves under mitostress/natural senescence. Across species, enrichment of mitostasis-related transcription factor binding sites and amino acid changes converge on putative senescence modulators. Our study provides a cross-species, multi-omics perspective for understanding the conserved mechanisms of leaf senescence.
Project description: Background - Senescence classification is an acknowledged challenge within the field, as markers are cell-type- and context-dependent. Currently, multiple morphological and immunofluorescence markers are required. However, emerging scRNA-seq datasets have enabled increased understanding of senescent cell heterogeneity. Methods - Here we present SenPred, a machine-learning pipeline which identifies fibroblast senescence based on single-cell transcriptomics from fibroblasts grown in 2D and 3D. Results - Using scRNA-seq of both 2D and 3D deeply senescent fibroblasts, the model predicts intra-experimental fibroblast senescence with high accuracy (>99% true positives). Applying SenPred to in vivo whole skin scRNA-seq datasets reveals that a model trained on cells grown in 2D cannot accurately detect fibroblast senescence in vivo. Importantly, utilising scRNA-seq from 3D deeply senescent fibroblasts refines our ML model, leading to improved detection of senescent cells in vivo. This is context specific, with the SenPred pipeline proving effective when detecting senescent human dermal fibroblasts in vivo, but not senescence of lung fibroblasts or whole skin. Conclusions - We position this as a proof-of-concept study based on currently available scRNA-seq datasets, with the intention to build a holistic model to detect multiple senescent triggers using future emerging datasets. The development of SenPred has allowed for detection of an in vivo senescent fibroblast burden in human skin, which could have broader implications for the treatment of age-related morbidities.
Project description: To achieve the best outcomes, breast cancer necessitates robust strategies for early detection. However, reliable blood-based tests for identifying early-stage disease remain elusive. Here we have employed plasma metabolomics and machine learning techniques to establish a non-invasive metabolic approach for early detection of breast cancer.
Project description: We introduce MSTracer, a tool for peptide feature detection from MS1, which incorporates a machine-learning-based scoring function that combines peptide isotopic distribution and peptide intensity shape on the LC-MS map. Support Vector Regression (SVR) markedly improves the quality of detected peptide features, and Neural Networks (NN) assign each detected feature a score that indicates its quality. We train and test on the human HeLa LC-MS/MS dataset and compare the results with MaxQuant, OpenMS, and Dinosaur.