A Novel Promoter CpG-Based Signature for Long-Term Survival Prediction of Breast Cancer Patients.
ABSTRACT: DNA methylation has been reported as one of the most critical epigenetic aberrations during the tumorigenesis and development of breast cancer (BC). This study explored a novel promoter CpG-based signature for long-term survival prediction of BC patients. We used The Cancer Genome Atlas (TCGA) data as the training set, and the results were validated in an independent dataset from the Gene Expression Omnibus (GEO). First, differentially methylated CpG sites were screened in the TCGA dataset, from which candidate promoter CpG sites were preliminarily identified with univariate Cox regression analysis and least absolute shrinkage and selection operator (LASSO) regression analysis. Second, the signature was constructed with stepwise regression analysis and a multivariate Cox proportional hazards model, and was validated by survival analysis in two cohorts from the TCGA and GEO databases. The 10-year receiver operating characteristic curves of the risk score presented an area under the curve of over 0.7 for both cohorts. A nomogram was also constructed and released. Moreover, Gene Set Enrichment Analysis was performed to identify the more active pathways in high-risk patients. The CpG site-target gene correlations and differentially methylated regions were further explored. In conclusion, the promoter CpG-based signature exhibited good prognostic prediction efficacy for the long-term overall survival of BC patients.
Project description:Glioblastoma (GBM) is the most common and aggressive primary malignant brain tumor worldwide, and the survival rates of patients remain very poor. Therefore, a better understanding of the molecular oncology of GBM is urgently needed. In this study, we performed an integrative analysis of DNA methylation and gene expression to identify key epigenetic genes in GBM. The methylation and gene expression data of GBM patients in The Cancer Genome Atlas (TCGA) database were downloaded. After data preprocessing, we identified 4,881 differentially expressed genes (DEGs) between tumor and normal samples, including 1,111 upregulated and 3,770 downregulated genes. Then, we randomly separated all samples into a training set (n = 69) and a testing set (n = 69). We next obtained 11,269 survival-associated methylation sites by univariate and multivariate Cox regression analyses. In the correlation analysis, we defined 198 genes with low promoter methylation and high gene expression as epigenetically induced (EI) genes and 111 genes with high promoter methylation and low gene expression as epigenetically suppressed (ES) genes. Key markers, including C1orf61 and FAM50B, were selected with a Pearson correlation coefficient greater than 0.75. Further, we used the 20 CpG methylation sites of these two genes in an unsupervised clustering analysis based on the Euclidean distance. We found that the prognosis of the hypomethylated group was significantly better than that of the hypermethylated group (log-rank test p-value = 0.011). Validation in the TCGA testing set and a GEO dataset confirmed the prognostic value of our signature (p-value = 0.02 in TCGA and 0.012 in GEO). In conclusion, our findings provide methylation-based biomarkers with predictive and prognostic value for the diagnosis and treatment of GBM.
Project description:Background: Aberrant methylation of CpG islands in promoter regions in tumor cells is a critical event in non-small cell lung carcinoma (NSCLC) tumorigenesis and can be a potential diagnostic biomarker for NSCLC patients. The present study systematically and quantitatively reviewed the diagnostic ability of CDH13 methylation in NSCLC as well as in its subsets. Eligible studies were identified by searching PubMed, Web of Science, Cochrane Library and Embase. The pooled odds of CDH13 promoter methylation in lung cancer tissues versus normal controls were calculated by meta-analysis. Simultaneously, four independent DNA methylation datasets of NSCLC from the TCGA and GEO databases were downloaded and analyzed to validate the results of the meta-analysis. Results: Thirteen studies comprising 1850 samples were included in this meta-analysis. The pooled odds ratio of CDH13 promoter methylation in cancer tissues was 7.41 (95% CI: 5.34 to 10.29, P < 0.00001) compared with that in controls under a fixed-effect model. In the validation stage, 126 paired samples from TCGA were analyzed: 5 of the 6 CpG sites in the CpG island of CDH13 were significantly hypermethylated in lung adenocarcinoma tissues, but none of the 6 CpG sites was hypermethylated in squamous cell carcinoma tissues. Concordantly, the results from the other three datasets, which were obtained from the GEO database and consisted of 568 tumors and 256 normal tissues, were consistent with those from the TCGA dataset. Conclusion: The pooled data showed that the methylation status of the CDH13 promoter is strongly associated with lung adenocarcinoma. CDH13 methylation status could be a promising biomarker for the diagnosis of lung adenocarcinoma.
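To make the fixed-effect pooling mentioned above concrete, the following is a minimal Python sketch that combines per-study 2×2 tables by inverse-variance weighting of log odds ratios. It rests on stated assumptions: the abstract does not say which fixed-effect estimator (inverse-variance or Mantel-Haenszel) or which software was used, and the function name `pooled_odds_ratio`, the input layout, the 0.5 continuity correction, and the example counts are all hypothetical.

```python
import math

def pooled_odds_ratio(tables):
    """Fixed-effect (inverse-variance) pooled odds ratio with a 95% CI.

    Each table is (a, b, c, d):
      a = methylated tumors,   b = unmethylated tumors,
      c = methylated controls, d = unmethylated controls.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for a, b, c, d in tables:
        # Assumption: add 0.5 to every cell when any cell is zero,
        # so that the log odds ratio and its variance stay finite.
        if 0 in (a, b, c, d):
            a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
        log_or = math.log((a * d) / (b * c))
        variance = 1 / a + 1 / b + 1 / c + 1 / d
        weight = 1 / variance
        weighted_sum += weight * log_or
        weight_total += weight
    pooled_log_or = weighted_sum / weight_total
    se = math.sqrt(1 / weight_total)
    return (
        math.exp(pooled_log_or),
        math.exp(pooled_log_or - 1.96 * se),
        math.exp(pooled_log_or + 1.96 * se),
    )

# Hypothetical counts for three studies, for illustration only.
example_tables = [(30, 20, 10, 40), (25, 25, 8, 42), (18, 12, 5, 25)]
pooled, ci_low, ci_high = pooled_odds_ratio(example_tables)
print(f"pooled OR = {pooled:.2f} (95% CI: {ci_low:.2f} to {ci_high:.2f})")
```

Weighting each study by the inverse of its variance lets larger studies dominate the pooled estimate, which is appropriate only when between-study heterogeneity is low; otherwise a random-effects model is the usual alternative.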
Project description:Aberrant methylation of CpG islands acquired in promoter regions plays an important role in carcinogenesis. Accumulated evidence demonstrates that FHIT gene promoter hypermethylation is involved in non-small cell lung cancer (NSCLC). To test the diagnostic ability of FHIT methylation status in NSCLC, thirteen studies comprising 2,119 samples were included in our meta-analysis. Simultaneously, four independent DNA methylation datasets from the TCGA and GEO databases were analyzed for validation. The pooled odds ratio of FHIT promoter methylation in cancer samples was 3.43 (95% CI: 1.85 to 6.36) compared with that in controls. In subgroup analysis, a significant difference in FHIT gene promoter methylation status between NSCLC and controls was found in Asians but not in the Caucasian population. In the validation stage, 950 Caucasian samples, including 126 paired samples from TCGA and 568 cancer tissues and 256 normal controls from the GEO database, were analyzed, and none of the 8 CpG sites near the promoter region of the FHIT gene was significantly differentially methylated. Thus, the diagnostic role of the FHIT gene in lung cancer may be relatively limited in the Caucasian population but useful in Asians.
Project description:Background: There is plenty of evidence showing that immune-related genes (IRGs) and epigenetic modifications play important roles in the biological process of cancer. The purpose of this study is to establish novel IRG prognostic markers by integrating mRNA expression and methylation in lung adenocarcinoma (LUAD). Methods and Results: The transcriptome profiling data and the RNA-seq data of LUAD, with the corresponding clinical information of 543 LUAD cases, were downloaded from The Cancer Genome Atlas (TCGA) database and analyzed by univariate and multivariate Cox proportional regression to develop an independent prognostic signature. On the basis of this signature, we could divide LUAD patients into high-risk, medium-risk, and low-risk groups. Further survival analyses demonstrated that high-risk patients had significantly shorter overall survival (OS) than low-risk patients. The signature, which contains 8 IRGs (S100A16, FGF2, IGKV4-1, CX3CR1, INHA, ANGPTL4, TNFRSF11A, and VIPR1), was also validated with data from the Gene Expression Omnibus (GEO) database. We also analyzed the methylation levels of the relevant IRGs and their CpG sites. Their associations with prognosis were examined and validated in the GEO database, revealing that the methylation levels of INHA, S100A16, the CpG site cg23851011, and the CpG site cg06552037 may serve as potential regulators for the treatment of LUAD. Conclusion: Collectively, INHA, S100A16, the CpG site cg23851011, and the CpG site cg06552037 are promising biomarkers for monitoring the outcomes of LUAD.
Project description:To identify a glycolysis-related gene signature for the evaluation of prognosis in patients with breast cancer, we analyzed data from a training set from the TCGA database and four validation cohorts from the GEO and ICGC databases, which together included 1,632 patients with breast cancer. We conducted GSEA, univariate Cox regression, LASSO, and multivariate Cox regression analyses. Finally, an 11-gene signature related to glycolysis for predicting survival in patients with breast cancer was developed. Kaplan-Meier and ROC analyses suggested that the signature showed good prognostic ability for BC in the TCGA, ICGC, and GEO datasets. Univariate and multivariate Cox regression analyses revealed that it is an important prognostic factor independent of multiple clinical features. Moreover, a prognostic nomogram combining the gene signature and the clinical characteristics of patients was constructed. These findings provide insights into the identification of breast cancer patients with a poor prognosis.
Project description:DNA methylation has started a recent revolution in genomics biology by identifying key biomarkers for multiple cancers, including oral squamous cell carcinoma (OSCC), the most common head and neck squamous cell carcinoma. A multi-stage screening strategy was used to identify DNA-methylation-based signatures for OSCC prognosis. We used The Cancer Genome Atlas (TCGA) data as the training set, and the results were validated in two independent datasets from the Gene Expression Omnibus (GEO). The correlation between DNA methylation and corresponding gene expression and the prognostic value of the gene expression were explored as well. Seven DNA methylation CpG sites were identified that were significantly associated with OSCC overall survival. The prognostic signature, a weighted linear combination of the seven CpG sites, successfully distinguished the overall survival of OSCC patients and had a moderate predictive ability for survival [training set: hazard ratio (HR) = 3.23, P = 5.52 × 10^-10, area under the curve (AUC) = 0.76; validation set 1: HR = 2.79, P = 0.010, AUC = 0.67; validation set 2: HR = 3.69, P = 0.011, AUC = 0.66]. Stratification analysis by human papillomavirus status, clinical stage, age, gender, smoking status, and grade retained statistical significance. Expression of the genes corresponding to the candidate CpG sites (AJAP1, SHANK2, FOXA2, MT1A, ZNF570, HOXC4, and HOXB4) was also significantly associated with patient survival. A signature integrating DNA methylation, gene expression, and clinical information showed a superior ability for prognostic prediction (AUC = 0.78). A prognostic signature integrating DNA methylation, gene expression, and clinical information provides better prognostic prediction for OSCC patients than clinical information alone.
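The "weighted linear combination" of CpG sites described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not report the coefficients, so the CpG identifiers, weights, and beta values below are hypothetical placeholders, and the median cutoff used to form high-risk and low-risk groups is a common convention rather than a choice confirmed by the source.

```python
from statistics import median

def risk_score(sample, weights):
    """Weighted sum of methylation beta values for one sample."""
    return sum(weight * sample[cpg] for cpg, weight in weights.items())

def stratify(samples, weights):
    """Score every sample and split the cohort at the median score."""
    scores = [risk_score(sample, weights) for sample in samples]
    cutoff = median(scores)  # assumption: median split
    groups = ["high" if score > cutoff else "low" for score in scores]
    return scores, groups

# Hypothetical weights (e.g. Cox regression coefficients) and beta values.
weights = {"cg_site_1": 1.5, "cg_site_2": -0.8}
samples = [
    {"cg_site_1": 0.2, "cg_site_2": 0.9},
    {"cg_site_1": 0.5, "cg_site_2": 0.5},
    {"cg_site_1": 0.8, "cg_site_2": 0.1},
    {"cg_site_1": 0.9, "cg_site_2": 0.2},
]
scores, groups = stratify(samples, weights)
for score, group in zip(scores, groups):
    print(f"risk score = {score:.2f} -> {group} risk")
```

In a setup like this, the resulting groups would then be compared with a survival analysis such as a log-rank test, which is outside the scope of the sketch.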
Project description:Background: Promoter hypermethylation in the death-associated protein kinase 1 (DAPK1) gene has long been linked to cervical neoplasia, but the reported results remain controversial. Here, we performed a meta-analysis to assess the associations of DAPK1 promoter hypermethylation with low-grade intra-epithelial lesion (LSIL), high-grade intra-epithelial lesion (HSIL), cervical cancer (CC), and clinicopathological features of CC. Methods: Published studies with qualitative methylation data were initially searched from the PubMed, Web of Science, EMBASE, and China National Knowledge Infrastructure databases (up to March 2018). Then, quantitative methylation datasets, retrieved from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases, were pooled to validate the results of the published studies. Results: In a meta-analysis of 37 published studies, DAPK1 promoter hypermethylation progressively increased the risk of LSIL by 2.41-fold (P = 0.012), HSIL by 7.62-fold (P < 0.001), and CC by 23.17-fold (P < 0.001). Summary receiver operating characteristic curves suggested a potential diagnostic value of DAPK1 promoter hypermethylation in CC, with a large area under the curve of 0.83, a high specificity of 97%, and a moderate sensitivity of 59%. There were significant impacts of DAPK1 promoter hypermethylation on histological type (odds ratio (OR) = 3.53, P < 0.001) and FIGO stage of CC (OR = 2.15, P = 0.003). Then, a pooled analysis of nine TCGA and GEO datasets, covering 13 CpG sites within the DAPK1 promoter, identified eight CC-associated sites, six sites with diagnostic value for CC (pooled specificities: 74-90%; pooled sensitivities: 70-81%), nine loci associated with the histological type of CC, and all 13 loci with down-regulating effects on DAPK1 mRNA expression. Conclusion: The meta-analysis suggests that DAPK1 promoter hypermethylation is significantly associated with the disease severity of cervical neoplasia. DAPK1 methylation detection exhibits a promising ability to discriminate CC from cancer-free controls.
Project description:BACKGROUND: Prostate cancer (PC) is a commonly diagnosed malignancy in males, especially in the western hemisphere. The extensive use of multiple biomarkers plays an important role in the diagnosis and prognosis of PC. However, the accuracy of biomarkers for PC prognosis urgently needs to be improved. This study aimed to identify a novel prognostic biomarker for PC. MATERIALS AND METHODS: Differentially methylated CpG sites were identified from the GSE76938 dataset (https://www.ncbi.nlm.nih.gov/geo/) using R software version 3.1.4. Four significant CpG sites on the SLCO4C1 gene were found to be closely associated with prognosis in PC. Data downloaded from The Cancer Genome Atlas (TCGA) were used for validation. Co-expression and functional enrichment analyses were used to explore the roles of SLCO4C1 in molecular functions, biological processes and cellular components. Total RNA extraction and qRT-PCR were used to reveal the difference in SLCO4C1 expression between tumour and normal tissues. Bisulfite amplicon sequencing (BSAS) was used to identify methylation levels at the CpG sites. RESULTS: In the GSE76938 cohort, 10,206 CpG sites were identified as differentially methylated in tumour versus normal prostate tissues. Among these CpG sites, four sites (cg06480736, cg19774478, cg19788741 and cg22149516) located in the promoter region (TSS200-1500) of SLCO4C1 were found to be significantly hypermethylated in tumour tissues. The results were validated in an independent dataset (TCGA PRAD cohort). In the cohort from TCGA, SLCO4C1 expression was negatively correlated with methylation levels at the four sites. The results of qRT-PCR confirmed that tumour tissues had relatively lower expression of SLCO4C1. BSAS further confirmed a higher methylation level at the SLCO4C1 promoter in tumour tissues. SLCO4C1 (cg06480736, cg19774478, cg19788741 and cg22149516) was identified as a promising biomarker for biochemical recurrence-free survival in Kaplan-Meier analysis (P < 0.01) and univariate Cox proportional hazards analysis: cg06480736 (HR 15.914, P < 0.001), cg19774478 (HR 9.001, P < 0.001), cg19788741 (HR 10.759, P = 0.003) and cg22149516 (HR 17.144, P = 0.006). However, only three sites, namely, cg06480736 (HR 1.809, P = 0.049), cg19774478 (HR 1.903, P = 0.041) and cg22149516 (HR 2.316, P = 0.008), were confirmed in multivariate analysis. CONCLUSIONS: SLCO4C1 promoter methylation, including that at three CpG sites, namely, cg06480736, cg19774478 and cg22149516, is a potential biomarker for risk stratification and might offer relevant prognostic information for PC patients after radical prostatectomy.
Project description:Background: Abnormal epigenetic alterations can contribute to the development of human malignancies. Identification of these alterations for early screening and prognosis of clear cell renal cell carcinoma (ccRCC) has been a highly sought-after goal. Bioinformatic analysis of DNA methylation data provides broad prospects for the discovery of epigenetic biomarkers. However, methylation-driven genes of ccRCC remain underexplored. Methods: Gene expression data and DNA methylation data in metastatic ccRCC were sourced from the Gene Expression Omnibus (GEO) database. Differentially methylated genes (DMGs) at 5'-C-phosphate-G-3' (CpG) sites and differentially expressed genes (DEGs) were screened, and the genes overlapping between DMGs and DEGs were then subjected to gene set enrichment analysis. Next, weighted gene co-expression network analysis (WGCNA) was used to search for hub DMGs associated with ccRCC. Cox regression and ROC analyses were performed to screen potential biomarkers and develop a prognostic model based on the screened hub genes. Results: Three hundred and fourteen overlapping DMGs were obtained from two independent GEO datasets. The turquoise module, the most significant module screened by WGCNA, contained 79 hub DMGs. Furthermore, a total of 12 hub genes (CETN3, DCAF7, GPX4, HNRNPA0, NUP54, SERPINB1, STARD5, TRIM52, C4orf3, C12orf51, and C17orf65) were identified in the TCGA database by multivariate Cox regression analyses. All 12 genes were then used to generate the model for diagnosis and prognosis of ccRCC. ROC analysis showed that these genes exhibited good diagnostic efficiency for metastatic and non-metastatic ccRCC. Furthermore, the prognostic model with the 12 methylation-driven genes demonstrated good prediction of 5-year survival rates for ccRCC patients. Conclusion: Integrative analysis of DNA methylation data identified 12 signature genes, which could be used as epigenetic biomarkers for the prognosis of metastatic ccRCC. This prognostic model predicts 5-year survival well for ccRCC patients.
Project description:Background: DNA methylation is a form of epigenetic modification that has been shown to play a significant role in gene regulation. In cancer, DNA methylation plays an important role by regulating the expression of oncogenes. The role of DNA methylation in the onset and progression of various cancer types is now being elucidated as more large-scale data become available. The Cancer Genome Atlas (TCGA) provides a wealth of information for the analysis of various molecular aspects of cancer genetics. Gene expression data and DNA methylation data from TCGA have been used for a variety of studies. A traditional understanding of the effects of DNA methylation on gene expression has linked methylation of CpG sites in the gene promoter region with a decrease in gene expression. Recent studies have begun to expand this traditional role of DNA methylation. Results: Here we present a pan-cancer analysis of correlation patterns between CpG methylation and gene expression. Using matching patient data from TCGA, 33 cancer-specific correlations were calculated for each CpG site and the expression level of its corresponding gene. These correlations were used to identify patterns on a per-site basis as well as patterns of methylation across the gene body. Using these identified patterns, we found genes that contain conflicting methylation signals beyond the commonly accepted association between promoter region methylation and silencing of gene expression. Beyond gene body methylation as a whole, we examined individual CpG sites and show that, even in the same gene body, some sites can have a contradictory effect on gene expression in cancers. Conclusions: We observed that within promoter regions there was a substantial amount of positive correlation between methylation and gene expression, which contradicts the commonly accepted association. We observed that the correlation between CpG methylation and gene expression is not tissue specific, suggesting that the effects of methylation on gene expression are largely tissue independent. The analysis of correlation in relation to the location of the CpG site in the gene body has led to the identification of several different methylation patterns that affect gene expression, and several examples of methylation activating gene expression were observed. Distinctly opposing or conflicting effects were seen in close proximity on the gene body, where negative and positive correlations were seen at neighboring CpG sites.
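The per-site correlation analysis described above can be illustrated with a short Python sketch that computes a Pearson correlation between each CpG site's methylation values and the expression of its gene across matched samples, then labels the direction of the association. The abstract does not specify the correlation measure or any cutoff, so the use of Pearson correlation, the `threshold` of 0.3, the function names, and all values below are assumptions for illustration only.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")  # undefined when either variable is constant
    return cov / (sd_x * sd_y)

def classify_sites(methylation_by_site, expression, threshold=0.3):
    """Label each CpG site by the sign of its methylation-expression correlation."""
    labels = {}
    for site, betas in methylation_by_site.items():
        r = pearson(betas, expression)
        if math.isnan(r) or abs(r) < threshold:
            labels[site] = "weak"
        elif r > 0:
            labels[site] = "positive"
        else:
            labels[site] = "negative"
    return labels

# Hypothetical matched data: one gene, four samples, three CpG sites.
expression = [1.0, 2.0, 3.0, 4.0]
methylation_by_site = {
    "promoter_site": [0.9, 0.7, 0.4, 0.2],   # methylation falls as expression rises
    "gene_body_site": [0.1, 0.3, 0.6, 0.8],  # methylation rises with expression
    "constant_site": [0.5, 0.5, 0.5, 0.5],   # no variation across samples
}
print(classify_sites(methylation_by_site, expression))
```

Neighboring sites with opposite labels in such an output would correspond to the conflicting signals within one gene that the abstract reports.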