Robust Prognostic Subtyping of Muscle-Invasive Bladder Cancer Revealed by Deep Learning-Based Multi-Omics Data Integration.
ABSTRACT: Muscle-invasive bladder cancer (MIBC) is the most common urinary system carcinoma associated with poor outcomes. It is necessary to develop a robust classification system for prognostic prediction of MIBC. Recently, an increasing amount of omics data at different levels has been produced for MIBC, but few integration methods have been used to classify MIBC in a way that reflects patient prognosis. In this study, we constructed an autoencoder-based deep learning framework to integrate multi-omics data of MIBC and clustered samples into two subgroups with a significant difference in overall survival (P = 8.11 × 10⁻⁵). These two subtypes are an independent prognostic factor relative to clinical information and show several significant genomic differences. Remarkably, the poor-prognosis subtype had a significantly higher frequency of chromosome 3p deletion. Immune decomposition analysis showed that the two MIBC subtypes differed in immune components including M1 macrophages, resting NK cells, regulatory T cells, plasma cells, and naïve B cells. Hallmark gene set enrichment analysis was performed to investigate the functional differences between the two MIBC subtypes, which revealed that activated IL-6/JAK/STAT3 signaling, interferon-alpha response, reactive oxygen species pathway, and unfolded protein response were significantly enriched in the upregulated genes of the high-risk subtype. We constructed MIBC subtyping models based on multi-omics data and on single-omics data, and internal and external validation datasets showed the robustness of the prediction model as well as its prognostic ability (P < 0.05 in all datasets). Finally, through bioinformatics analysis and immunohistochemistry experiments, we found that KRT7 can be used as a biomarker reflecting MIBC risk.
Project description:High-risk neuroblastoma is a very aggressive disease, with excessive tumor growth and poor outcomes. Proper stratification of high-risk patients by prognostic outcome is important for treatment. However, survival stratification for high-risk neuroblastoma is still lacking. To fill the gap, we adopt a deep learning algorithm, Autoencoder, to integrate multi-omics data, and combine it with K-means clustering to identify two subtypes with significant survival differences. When compared with PCA, iCluster, and DGscore for classification based on multi-omics data integration, the Autoencoder-based classification outperforms the alternative approaches. Furthermore, we validated the classification in two independent datasets by training machine-learning classification models, and confirmed its robustness. Functional analysis revealed that MYCN amplification occurred more frequently in the ultra-high-risk subtype, in accordance with the overexpression of MYC/MYCN targets in this subtype. In summary, prognostic subtypes identified by deep learning-based multi-omics integration could not only improve our understanding of the molecular mechanism, but also help clinicians make decisions.
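The clustering step described above (K-means applied to autoencoder-derived features) can be illustrated with a minimal sketch. It assumes the latent features have already been extracted and are given as plain lists of floats; the function names, the toy data, and the deterministic farthest-point initialization are illustrative assumptions, not the authors' implementation.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=2, max_iter=100):
    """Lloyd's algorithm with a deterministic farthest-point initialization.

    points: list of equal-length lists (hypothetical latent features).
    Returns (labels, centroids).
    """
    # Initialization (an assumption of this sketch): start from the first
    # point, then repeatedly add the point farthest from all chosen centroids.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy latent features for six hypothetical samples.
latent = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
labels, centroids = kmeans(latent, k=2)
print(labels)
```

In practice the resulting labels would then be compared on survival, for example with a log-rank test, to check whether the two clusters differ prognostically.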
Project description:A major challenge for treating patients with pancreatic ductal adenocarcinoma (PDAC) is the unpredictability of their prognoses due to high heterogeneity. We present Multi-Omics DEep Learning for Prognosis-correlated subtyping (MODEL-P) to identify PDAC subtypes and to predict prognoses of new patients. MODEL-P was trained on autoencoder-integrated multi-omics data of 146 patients with PDAC together with their survival outcomes. Using MODEL-P, we identified two PDAC subtypes with distinct survival outcomes (median survival 10.1 and 22.7 months, respectively, log-rank p = 1 × 10⁻⁶), which correspond to DNA damage repair and immune response. We rigorously validated MODEL-P by stratifying patients in five independent datasets into these two survival groups and achieved significant survival differences, a result superior to current practice and other subtyping schemas. We believe the subtype-specific signatures would facilitate PDAC pathogenesis discovery, and that MODEL-P can provide clinicians with prognostic information for treatment decision-making to better gauge the benefits versus the risks.
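The log-rank p-values quoted in these abstracts compare survival between two subtypes. As a rough illustration of how such a statistic is computed, the sketch below implements the standard two-group log-rank test using only the Python standard library. The function name and the toy follow-up data are assumptions for illustration and are unrelated to the cohorts described above.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*: follow-up times; events_*: 1 if the event (death) was observed,
    0 if the observation was censored. Returns (chi2, p_value), where the
    p-value comes from a chi-square distribution with one degree of freedom.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    obs_minus_exp = 0.0  # observed minus expected events in group A
    variance = 0.0       # hypergeometric variance summed over event times
    for t in sorted({t for t, e, _ in data if e}):
        n = sum(1 for ti, _, _ in data if ti >= t)                # at risk, both groups
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)   # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e)          # events at time t
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0  # no information to separate the groups
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy example: group A fails early, group B fails late (all events observed).
chi2, p = logrank_test([2, 3, 4, 5, 6], [1] * 5, [10, 12, 14, 16, 18], [1] * 5)
print(round(chi2, 2), p < 0.01)
```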
Project description:Basal and luminal subtypes of muscle-invasive bladder cancer (MIBC) have distinct molecular profiles and heterogeneous clinical behaviors. The interactions between mRNAs and lncRNAs, which might be regulated by miRNAs, have crucial roles in many cancers. However, the miRNA-dependent crosstalk between lncRNA and mRNA in specific MIBC subtypes remains unclear. In this study, we first classified MIBC into two conservative subtypes using miRNA, mRNA and lncRNA expression data derived from The Cancer Genome Atlas. Then we investigated subtype-related biological pathways and evaluated the subtype classification performance using Decision Trees, Random Forest and eXtreme Gradient Boosting (XGBoost). Finally, we explored potential miRNA-mediated lncRNA-mRNA crosstalks based on co-expression analysis. Our results show that: (1) the luminal subtype is primarily characterized by upregulation of metabolism-related pathways, while the basal subtype is predominantly characterized by upregulation of epithelial-mesenchymal transition, metastasis, and immune system process-related pathways; (2) the XGBoost prediction model is consistently robust for classification of the molecular subtypes of MIBC across four datasets (area under the ROC curve > 0.9); (3) the expression levels of the molecules in the miR-200c and miR-141-mediated lncRNA-mRNA crosstalks differ considerably between the two subtypes and are closely related to the prognosis of MIBC. The miR-200c and miR-141-dependent mRNA-lncRNA crosstalks might be of great significance in tumorigenesis and tumor progression and may serve as novel prognostic predictors and classification markers of MIBC subtypes.
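The classification performance above is reported as the area under the ROC curve (AUC). The following minimal sketch shows how that metric can be computed from predicted scores through its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The labels and scores are invented for illustration, and the classifier itself (XGBoost in the study) is not reproduced here.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. one subtype), 0 for the other.
    scores: classifier scores, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```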
Project description:<h4>Background</h4>Although various molecular subtypes of bladder cancer (BC) have been investigated, most of these studies have focused on muscle-invasive BC (MIBC). A few studies have investigated non-muscle-invasive BC (NMIBC) or NMIBC and MIBC together, but none has classified progressive NMIBC or immune checkpoint inhibitor (ICI)-based therapeutic responses in early-stage BC patients.<h4>Methods</h4>A total of 1,934 samples from seven patient cohorts were used. We performed unsupervised hierarchical clustering to stratify patients into distinct subgroups and constructed a classifier by applying SAM/PAM algorithms. We then investigated the association between molecular subtypes and immunotherapy responsiveness using various statistical methods.<h4>Findings</h4>We explored large-scale genomic datasets encompassing NMIBC and MIBC, redefining four distinct molecular subtypes, including a subgroup containing progressive NMIBC and MIBC with poor prognosis that would benefit from ICI treatment. This subgroup showed poor progression-free survival with the distinct features of high mutation load, activated cell cycle, and inhibited TGFβ signalling. Importantly, we verified that BC patients with this subtype were significantly responsive to an anti-PD-L1 agent in the IMvigor210 cohort.<h4>Interpretation</h4>Our results reveal an immunotherapeutic option for ICI treatment of highly progressive NMIBC and MIBC with poor prognosis.<h4>Funding</h4>This research was supported by the National Research Foundation of Korea grant funded by the Korean government, a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute, funded by the Ministry of Health and Welfare, Republic of Korea, and a grant from the KRIBB Research Initiative Program.
Project description:Only a subgroup of patients with muscle-invasive bladder cancer (MIBC) respond to cisplatin-based chemotherapy and PD-L1 blockade immunotherapy. There is a clinical need to identify MIBC molecular subtypes and biomarkers for stratifying patients for these therapies. Here, we performed an integrative clustering analysis of 388 MIBC samples with multi-omics data, identified basal and luminal/differentiated integrative subtypes, and derived a 42-gene panel for classification of MIBC. Using nine additional gene expression datasets (n = 844), we demonstrated the prognostic value of the 42 basal-luminal genes. The basal subtype was associated with worse overall survival in patients receiving no neoadjuvant chemotherapy (NAC), but better overall survival in patients receiving NAC in two clinical trials. Each of the subtypes could be further divided into a chr9 p21.3 normal or loss subgroup. The patients with low expression of MTAP/CDKN2A/2B (indicative of chr9 p21.3 loss) had a significantly lower response rate to anti-PD-L1 immunotherapy and worse survival than the patients with high expression of MTAP/CDKN2A/2B. This integrative analysis reveals intrinsic MIBC subtypes and biomarkers with prognostic value for the frontline therapies. Qianxing Mo et al. identify basal and luminal integrative subtypes of muscle-invasive bladder cancer (MIBC) using multi-omics data from The Cancer Genome Atlas. Using a gene panel for classification of MIBC derived from gene expression data, they find that the basal subtype is associated with worse survival in patients receiving no neoadjuvant chemotherapy (NAC), but better survival in patients receiving cisplatin-based NAC, and further identify genes associated with response to PD-L1 blockade immunotherapy, suggesting potential clinical use of these genes’ expression signature.
Project description:Glioblastoma (GBM) is a lethal tumor, but few biomarkers and molecular subtypes predicting prognosis are available. This study aimed to identify prognostic subtypes and multi-omics signatures for GBM. Using oncopression and TCGA-GBM datasets, we identified the 80 genes most associated with GBM prognosis using correlations between gene expression levels and overall survival of patients. The prognostic score of each sample was calculated using these genes, and samples were then assigned to three prognostic subtypes. This classification was validated in two independent datasets (REMBRANDT and Severance). Functional annotation revealed that invasion- and cell cycle-related gene sets were enriched in the poor and favorable groups, respectively. The three GBM subtypes were therefore named invasive (poor), mitotic (favorable), and intermediate. Interestingly, the invasive subtype showed increased invasiveness, and MGMT methylation was enriched in the mitotic subtype, indicating a need for different therapeutic strategies according to prognostic subtype. For clinical convenience, we also identified genes that best distinguished the invasive and mitotic subtypes. Immunohistochemical staining showed markedly higher expression of PDPN in the invasive subtype and of TMEM100 in the mitotic subtype (P < 0.001). We expect that this transcriptome-based classification, with multi-omics signatures and biomarkers, can improve molecular understanding of GBM, ultimately leading to precise stratification of patients for therapeutic interventions.
Project description:We propose an unsupervised multi-omics integration pipeline, using a deep-learning autoencoder algorithm, to predict survival subtypes in bladder cancer (BC). We used the TCGA dataset comprising mRNA, miRNA and methylation data to infer two survival subtypes. We then constructed a supervised classification model to predict the survival subgroup of any new individual sample. Our training data gave two subgroups with significant survival differences (p-value=8e-4), where the high-risk survival subgroup was enriched with KRT6/14 overexpression and PI3K-Akt pathways. We tested the robustness of the model by randomly splitting the main dataset into multiple training and test folds, which gave overall significant p-values. Then, we successfully inferred the subtypes for a subset of samples kept as a test dataset (p-value=0.03). We further applied our pipeline to predict the survival subgroups from another validation dataset with miRNA data (p-value=0.02). In conclusion, the present pipeline is an effective approach to infer the survival subtype of a new sample, exemplified by BC.
Project description:A heterogeneous disease such as cancer is activated through multiple pathways and different perturbations. Depending upon the activated pathway(s), patient survival varies significantly, as does the efficacy of various drugs. Therefore, cancer subtype detection using genomics-level data is a significant research problem. Subtype detection is often a complex problem and, in most cases, needs multi-omics data fusion to achieve accurate subtyping. Different data fusion and subtyping approaches have been proposed over the years, such as kernel-based fusion, matrix factorization, and deep learning autoencoders. In this paper, we compared the performance of different deep learning autoencoders for cancer subtype detection. We performed cancer subtype detection on four different cancer types from The Cancer Genome Atlas (TCGA) datasets using four autoencoder implementations. We also predicted the optimal number of subtypes in a cancer type using the silhouette score and found that the detected subtypes exhibit significant differences in survival profiles. Furthermore, we compared the effect of feature selection and similarity measures on subtype detection. For further evaluation, we used the Glioblastoma multiforme (GBM) dataset and identified the differentially expressed genes in each of the subtypes. The results obtained are consistent with other genomic studies and can be corroborated with the involved pathways and biological functions. Thus, the results from the autoencoders, obtained through the interaction of different cancer data types, can be used for the prediction and characterization of patient subgroups and survival profiles.
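The silhouette score mentioned above measures how well each sample fits its own cluster relative to the nearest other cluster. To choose the number of subtypes, one computes it for each candidate clustering and keeps the highest. The sketch below is a minimal standard-library version with invented one-dimensional data. The function name is an assumption, and the study's own similarity measures may differ from the Euclidean distance used here.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all samples (Euclidean distance).

    Requires at least two clusters. For a sample, a is the mean distance to
    the other members of its cluster and b is the smallest mean distance to
    any other cluster; its silhouette is (b - a) / max(a, b). Samples in
    singleton clusters contribute 0.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # contributes 0 by convention
        # The distance from p to itself is 0, so divide by len(own) - 1.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        total += (b - a) / max(a, b)
    return total / len(points)


# Toy data: a natural split scores much higher than an arbitrary one.
pts = [[0.0], [1.0], [10.0], [11.0]]
print(silhouette_score(pts, [0, 0, 1, 1]) > silhouette_score(pts, [0, 1, 0, 1]))
```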
Project description:Cancer is a complex disease that deregulates cellular functions at various molecular levels (e.g., DNA, RNA, and proteins). Integrated multi-omics analysis of data from these levels is necessary to understand the aberrant cellular functions accountable for cancer and its development. In recent years, Deep Learning (DL) approaches have become a useful tool in integrated multi-omics analysis of cancer data. However, high-dimensional multi-omics data are generally imbalanced, with too many molecular features and relatively few patient samples. This imbalance makes DL-based integrated multi-omics analysis difficult. DL-based dimensionality reduction techniques, including the variational autoencoder (VAE), are a potential solution for balancing high-dimensional multi-omics data. However, there are few VAE-based integrated multi-omics analyses, and they are limited to pan-cancer studies. In this work, we performed an integrated multi-omics analysis of ovarian cancer using the compressed features learned through VAE and an improved version of VAE, namely Maximum Mean Discrepancy VAE (MMD-VAE). First, we designed and developed a DL architecture for VAE and MMD-VAE. Then we used the architecture for mono-omics, integrated di-omics and tri-omics data analysis of ovarian cancer through cancer sample identification, molecular subtype clustering and classification, and survival analysis. The results show that MMD-VAE and VAE-based compressed features can respectively classify the transcriptional subtypes of the TCGA datasets with an accuracy in the range of 93.2-95.5% and 87.1-95.7%. Also, survival analysis results show that VAE and MMD-VAE based compressed representations of omics data can be used in cancer prognosis.
Based on the results, we can conclude that (i) VAE and MMD-VAE outperform existing dimensionality reduction techniques, (ii) integrated multi-omics analyses perform better than or similarly to their mono-omics counterparts, and (iii) MMD-VAE performs better than VAE in most omics datasets.
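For readers unfamiliar with the maximum mean discrepancy (MMD), the sketch below computes a simple (biased) estimate of squared MMD between two sets of vectors with a Gaussian kernel. In MMD-VAE-style models, a term of this kind is typically evaluated between encoded latent samples and samples drawn from the prior. The kernel bandwidth, function names, and toy data are assumptions for illustration and do not reproduce the architecture described above.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two vectors."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq / (2 * sigma ** 2))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between samples xs and ys.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)], with each expectation
    replaced by the mean over all pairs (including identical pairs).
    """
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, sigma) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2 * mean_kernel(xs, ys)


# Toy example: identical samples give 0, well-separated samples give a large value.
latent = [[0.0], [0.1]]
prior = [[5.0], [5.1]]
print(mmd_squared(latent, latent), mmd_squared(latent, prior) > 1.0)
```

A design note: unlike the KL term of a standard VAE, this quantity compares whole sample sets rather than per-sample distributions, which is why it is computed over batches.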
Project description:<h4>Background</h4> Lung adenocarcinoma (LUAD) is the most frequently diagnosed histological subtype of lung cancer. Our purpose was to explore molecular subtypes and core genes for LUAD using multi-omics analysis. <h4>Methods</h4> Methylation, transcriptome, copy number variation (CNV), mutation and clinical feature information concerning LUAD was retrieved from The Cancer Genome Atlas (TCGA) database. Molecular subtypes were identified via the “iClusterPlus” package in R, followed by Kaplan-Meier survival analysis. The correlation between iCluster subtypes and immune cells was analyzed. Core genes were screened out by integration of methylation, CNV and gene expression, and were externally validated in independent datasets. <h4>Results</h4> Two iCluster subtypes were identified for LUAD. Patients in imprinting centre 1 (iC1) subtype had a poorer prognosis than those in iC2 subtype. Furthermore, iC2 subtype had a higher level of B cell infiltration than iC1 subtype. Two core genes, CNTN4 and RFTN1, were screened out, both of which had higher expression levels in iC2 subtype than in iC1 subtype. Their CNV and methylation differed distinctly between the two subtypes. After validation, low expression of CNTN4 and RFTN1 predicted poorer clinical outcomes for LUAD patients. <h4>Conclusion</h4> Our study comprehensively analyzed the genomics, epigenomics, and transcriptomics of LUAD, offering novel underlying molecular mechanisms for LUAD. Two multi-omics-based core genes (CNTN4 and RFTN1) could become potential therapeutic targets for LUAD. <h4>Supplementary Information</h4> The online version contains supplementary material available at 10.1186/s12885-021-07888-4.