Establishment of an immune-related gene pair model to predict colon adenocarcinoma prognosis.
ABSTRACT: BACKGROUND:Colon cancer is the most common type of gastrointestinal cancer and has high morbidity and mortality. Colon adenocarcinoma (COAD) is the main pathological type of colon cancer, and much evidence has supported the correlation between the prognosis of COAD and the immune system. The current study aimed to develop a robust prognostic immune-related gene pair (IRGP) model to estimate the overall survival of patients with COAD. METHODS:The gene expression profiles and clinical information of patients with colon adenocarcinoma were obtained from the TCGA and GEO databases and were divided into training and validation cohorts. Immune genes were selected that showed a significant association with prognosis. RESULTS:Among 1647 immune genes, a model with 17 IRGPs was built that was significantly associated with OS in the training cohort. In the training and validation datasets, the IRGP model divided patients into the high-risk group and low-risk group, and the prognosis of the high-risk group was significantly worse (P<0.001). Univariate and multivariate Cox proportional hazard analyses confirmed the feasibility of this model. Functional analysis confirmed that multiple tumor progression and stem cell growth-related pathways were upregulated in the high-risk groups. Regulatory T cells and macrophages M0 were significantly highly expressed in the high-risk group. CONCLUSION:We successfully constructed an IRGP model that can predict the prognosis of COAD, providing new insights into the treatment strategy of COAD.
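The abstract above does not spell out how a gene pair is scored, so the following is only a minimal sketch under a stated assumption: each immune-related gene pair is encoded as 1 when the first gene is expressed at a lower level than the second within the same sample, and 0 otherwise, and the risk score is a weighted sum of those indicators. The gene names, the coefficients, and the function names `gene_pair_features` and `pair_risk_score` are hypothetical and chosen for illustration.

```python
from itertools import combinations


def gene_pair_features(expression, genes):
    """Encode every gene pair of one sample as a 0/1 indicator.

    expression: dict mapping gene name -> expression level for one sample.
    Assumption (not stated in the abstract): a pair "A|B" scores 1 when
    A is expressed at a lower level than B in this sample, else 0.
    """
    features = {}
    for a, b in combinations(genes, 2):
        features[f"{a}|{b}"] = 1 if expression[a] < expression[b] else 0
    return features


def pair_risk_score(features, coefficients):
    """Weighted sum of pair indicators; coefficients are hypothetical."""
    return sum(coef * features[pair] for pair, coef in coefficients.items())


# Hypothetical sample with three genes and made-up expression levels.
sample = {"GENE_A": 5.0, "GENE_B": 2.0, "GENE_C": 7.5}
features = gene_pair_features(sample, ["GENE_A", "GENE_B", "GENE_C"])
score = pair_risk_score(features, {"GENE_A|GENE_B": 1.0, "GENE_B|GENE_C": 0.5})
```

Because each indicator depends only on the ordering of two genes inside one sample, a score of this kind does not change under monotonic rescaling of a sample's expression values, which is one reason pair-based signatures are often applied across platforms.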
Project description:<h4>Background</h4>Immune-related genes are closely related to the occurrence and prognosis of head and neck squamous cell carcinoma (HNSCC), and they have great potential as prognostic markers in many types of cancer. The prognosis of HNSCC remains poor, and immune-related gene analysis may help predict its clinical outcome.<h4>Methods</h4>RNA-seq data and clinical follow-up information were downloaded from The Cancer Genome Atlas (TCGA), the GSE65858 chip expression data in MINiML format were downloaded from NCBI, and immune-related genes were downloaded from the InnateDB database. Immune-related genes in 519 HNSC patients were integrated from the TCGA dataset. Using multivariate Cox analysis and Lasso regression, robust immune-related gene pairs (IRGPs) that predict clinical outcomes of HNSCC were identified. Finally, a prognostic risk model based on immune gene pairs was established and verified using clinical features, test sets, and a GEO external validation set.<h4>Results</h4>A total of 699 IRGPs were significantly correlated with the prognosis of HNSCC patients. Fourteen robust IRGPs were finally obtained by Lasso regression, and a prognostic risk prediction model was constructed. The risk score of each sample was calculated from the risk model, and samples were divided into a high-risk group (Risk-H) and a low-risk group (Risk-L). The risk model was able to stratify risk among patients grouped by TNM stage, age, gender, and smoking history, and the AUC was > 0.65 in both the training set and the test set, showing that the 14-IRGP signature has excellent classification performance in patients with HNSCC. In addition, the 14-IRGP signature had the highest average C-index compared with 3 previously reported HNSCC prognostic signatures and with T, N, and age.<h4>Conclusion</h4>This study constructed a 14-IRGP signature as a novel prognostic marker for predicting survival in HNSCC patients.
Project description:Although the outcome of colorectal cancer (CRC) patients has improved significantly with the recent implementation of annual screening programs, reliable prognostic biomarkers are still needed because of the heterogeneity of the disease. A growing body of evidence has revealed an association between immune signatures and CRC prognosis. Thus, we aimed to build a robust immune-related gene pairs (IRGPs) signature that can estimate prognosis for CRC. Gene expression profiles and clinical information of CRC patients were collected from six public cohorts, which were divided into a training cohort (n = 565) and five independent validation cohorts (n = 572, 290, 90, 177, and 68, respectively). From 1534 immune genes, a 19-IRGP signature consisting of 36 unique genes was constructed that was significantly associated with survival. In the validation cohorts, the IRGPs signature significantly stratified patients into high- vs low-risk groups in terms of prognosis, both across and within subpopulations with early-stage disease, and was prognostic in univariate and multivariate analyses. Several biological processes, including response to bacterium, were enriched among genes in the IRGPs signature. M2 macrophages and mast cells were significantly more abundant in the high-risk group than in the low-risk group. The IRGPs signature achieved a higher accuracy than commercialized multigene signatures for estimation of survival. When integrated with clinical factors such as sex and stage, the composite clinical and IRGPs signature showed improved prognostic accuracy relative to the IRGPs signature alone. In short, we developed a robust IRGPs signature for estimating prognosis in CRC, including early-stage disease, providing new insights into the identification of CRC patients with a high risk of mortality.
Project description:Background:Colon adenocarcinoma (COAD) is the most common colon cancer and has high mortality. Because of their association with cancer progression, long noncoding RNAs (lncRNAs) are now being used as prognostic biomarkers. In the present study, we used clinical information and lncRNA expression profiles from The Cancer Genome Atlas database to construct a prognostic lncRNA signature for estimating the prognosis of patients. Methods:The samples were randomly split into training and validation cohorts. In the training cohort, prognosis-related lncRNAs were selected from differentially expressed lncRNAs using univariate Cox analysis. Furthermore, least absolute shrinkage and selection operator (LASSO) regression and multivariate Cox analysis were employed to identify prognostic lncRNAs. The prognostic signature was constructed from these lncRNAs. Results:The prognostic model was able to calculate each COAD patient's risk score and split the patients into low-risk and high-risk groups. Compared with the low-risk group, the high-risk group had a significantly poorer prognosis. Next, the prognostic signature was validated in the validation cohort as well as in the entire cohort. The receiver operating characteristic (ROC) curve and C-index were determined in all cohorts. Moreover, the prognostic lncRNA signature was combined with clinicopathological risk factors to construct a nomogram for predicting the prognosis of COAD in the clinic. Finally, seven lncRNAs (CTC-273B12.10, AC009404.2, AC073283.7, RP11-167H9.4, AC007879.7, RP4-816N1.7, and RP11-400N13.2) were identified and validated in different cohorts. The Kyoto Encyclopedia of Genes and Genomes analysis of the mRNAs co-expressed with the seven prognostic lncRNAs suggested four significantly upregulated pathways: the AGE-RAGE, focal adhesion, ECM-receptor interaction, and PI3K/Akt signaling pathways.
Conclusion:Thus, our study verified that the seven lncRNAs mentioned can be used as biomarkers to predict the prognosis of COAD patients and design personalized treatments.
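The risk-score step described in this abstract can be sketched roughly as follows. This is an illustration under assumptions, not the authors' implementation: the score is taken to be a linear combination of expression values weighted by regression coefficients, and the high/low cutoff is assumed to be the cohort median, which the abstract does not state. The lncRNA names, coefficients, patient values, and the function names `linear_risk_score` and `split_by_median` are all hypothetical.

```python
from statistics import median


def linear_risk_score(expression, coefficients):
    """Sum of coefficient * expression over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median score.

    Assumption: the median is used as the cutoff; scores equal to the
    median are placed in the low-risk group.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical two-lncRNA signature and four made-up patients.
coefficients = {"lncA": 0.5, "lncB": -0.2}
patients = {
    "p1": {"lncA": 2.0, "lncB": 1.0},
    "p2": {"lncA": 4.0, "lncB": 0.0},
    "p3": {"lncA": 0.0, "lncB": 5.0},
    "p4": {"lncA": 1.0, "lncB": 0.0},
}
scores = {pid: linear_risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

Other cutoffs (for example one chosen from a ROC curve) would change only `split_by_median`; the scoring function stays the same.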
Project description:OBJECTIVE:Melanoma is a rare but dangerous skin cancer that can spread rapidly in its advanced stages. Abundant evidence suggests a relationship between the immune system and tumor development and progression. A robust gene risk model could provide an accurate prediction of clinical outcomes. The present study aimed to explore a robust signature of immune-related gene pairs (IRGPs) for estimating overall survival (OS) in malignant melanoma. METHODS:Clinical and genetic data of skin cutaneous melanoma (SKCM) patients from The Cancer Genome Atlas (TCGA) were used as a training dataset to identify candidate IRGPs for the prognosis of melanoma. Two independent datasets, one from the Gene Expression Omnibus (GEO) database (GSE65904) and one from TCGA (TCGA-UVM), were selected for external validation. Univariate and multivariate Cox regression analyses were then performed to explore the prognostic power of the IRGPs signature and other clinical factors. CIBERSORTx was applied to estimate the fractions of infiltrated immune cells in bulk tumor tissues. RESULTS:A signature consisting of 33 IRGPs was established that was significantly associated with patients' survival in the TCGA-SKCM dataset (P = 2.0×10^-16, hazard ratio (HR) = 4.220 (2.909 to 6.122)). We found that the IRGPs signature was an independent prognostic factor in all the three independent cohorts in both the univariate and multivariate Cox analysis (P<0.01). The prognostic efficacy of the signature remained unaffected regardless of whether BRAF or NRAS was mutated. As expected, the results were verified in the GSE65904 dataset and the TCGA-UVM dataset. We found an apparently shorter OS in patients of the high-risk group in the GSE65904 dataset (P = 2.1×10^-3; HR = 1.988 (1.309 to 3.020)). The trend in the survival analysis in TCGA-UVM was as expected, but the result was not statistically significant (P = 0.117, HR = 4.263 (1.407 to 12.91)).
CD8 T cells, activated dendritic cells (DCs), regulatory T cells (Tregs), and activated CD4 memory T cells presented a significantly lower fraction in the high-risk group in the TCGA-SKCM dataset (P < 0.01). CONCLUSION:The results of the present study support the IRGPs signature as a promising marker for prognosis prediction in melanoma.
Project description:BACKGROUND:Colon adenocarcinoma (COAD) is a gastrointestinal tumor with a high degree of malignancy. Its progression is closely related to the tumor microenvironment, and transcription factors (TFs) play a regulatory role in this process. Currently, the relationship between genes related to the COAD tumor microenvironment and the survival prognosis of patients remains underexplored. Models composed of multiple genes usually predict the survival prognosis of patients more accurately than single genes, and multigene models that can predict the prognosis of COAD can be derived from current databases. METHODS:The limma package of the R programming language was used for differential gene expression analysis. Kaplan-Meier curves were used to analyze the relationship between the patient risk score model and survival data. The hazard model was used to analyze the relationship between the risk score and the clinical data of COAD patients. Information on immune genes and immune cells was obtained from the IMMPORT and TIMER databases. The receiver operating characteristic (ROC) curve was used to judge the stability of the model. RESULTS:We found 7 immune genes that can build a risk score model to predict the survival prognosis of COAD. According to univariate and multivariate analysis, the risk score can be used as an independent predictor. The content of some immune microenvironment cells also increases as the risk score increases. CONCLUSIONS:We found that 7 immune genes, namely SLC10A2 (solute carrier family 10 member 2), CXCL3 (C-X-C motif chemokine ligand 3), IGHV5-51 (immunoglobulin heavy variable 5-51), INHBA (inhibin subunit beta A), STC1 (stanniocalcin 1), UCN (urocortin), and OXTR (oxytocin receptor), can constitute a model for predicting the prognosis of COAD. They may provide potential therapeutic targets for clinical treatment of COAD.
Project description:<h4>Background</h4>Colon adenocarcinoma (COAD) patients who develop recurrence have poor prognosis. Our study aimed to establish an effective prognosis prediction model based on competing endogenous RNAs (ceRNAs) for recurrence of COAD.<h4>Methods</h4>COAD expression profiles downloaded from The Cancer Genome Atlas (TCGA) were used as the training dataset, and the expression profiles of GSE29623 retrieved from Gene Expression Omnibus (GEO) were used as the validation dataset. Differentially expressed RNAs (DERs) between non-recurrent and recurrent specimens in the training dataset were screened, and the optimum prognostic signature DERs were identified to establish a prognostic score (PS) model. Kaplan-Meier survival analysis was conducted for the PS model, and the GEO dataset was used for validation. Prognosis prediction efficiency was evaluated by area under the curve (AUC) and C-index. Meanwhile, a ceRNA regulatory network was constructed using the signature mRNAs, lncRNAs, and miRNAs.<h4>Results</h4>We identified 562 DERs, including 42 lncRNAs, 36 miRNAs, and 484 mRNAs. The PS prediction model, consisting of 17 optimum prognostic signature DERs, showed that the high-risk group had significantly poorer prognosis (5-year AUC = 0.951, C-index = 0.788), which was also validated in GSE29623. The prognosis prediction model incorporating multi-RNAs with pathologic distant metastasis (M) and pathologic primary tumor (T) (5-year AUC = 0.969, C-index = 0.812) had better efficiency than the clinical prognosis prediction model (5-year AUC = 0.712, C-index = 0.680). In the constructed ceRNA regulatory network, lncRNA NCBP2-AS1 could interact with hsa-miR-34c and hsa-miR-363, and lncRNA LINC00115 could interact with hsa-miR-363 and hsa-miR-4709. SIX4, GRAP, NKAIN4, MMAA, and ERVMER34-1 are regulated by hsa-miR-4709.<h4>Conclusion</h4>The prognosis prediction model incorporating multi-RNAs with pathologic M and pathologic T may have great value in COAD prognosis prediction.
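Since several of these abstracts report a C-index, a small worked sketch of how such a value can be computed may help. The version below assumes Harrell's concordance index with ties in risk score counted as half-concordant; the abstracts do not say which variant was used, and the survival times, event flags, and scores are made-up values. The function name `concordance_index` is illustrative.

```python
def concordance_index(times, events, scores):
    """Harrell's C-index for right-censored survival data.

    times:  observed follow-up time per patient
    events: 1 if the event (e.g. recurrence or death) was observed, 0 if censored
    scores: predicted risk, where a higher score should mean a shorter time

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tie in predicted risk
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up cohort: the third patient is censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
mixed = concordance_index(times, events, [0.9, 0.1, 0.4, 0.7])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported C-index values of roughly 0.68 to 0.81 sit between those two reference points.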
Project description:Colon adenocarcinoma (COAD) is a common type of colon cancer, and post-operative recurrence and metastasis may occur in COAD patients. This study was designed to build a risk score system for COAD patients. The Cancer Genome Atlas (TCGA) dataset of COAD (the training set) was downloaded, and GSE17538 and GSE39582 (the validation sets) were obtained from the Gene Expression Omnibus database. The differentially expressed RNAs (DERs) were analyzed with the limma package. Using the survival package, the independent prognosis-associated long non-coding RNAs (lncRNAs) were selected for constructing the risk score system. After the independent clinical prognostic factors were screened out using the survival package, a nomogram survival model was constructed using the rms package. Furthermore, the competitive endogenous RNA (ceRNA) regulatory network and enrichment analyses were performed using Cytoscape software and the DAVID tool, respectively. In total, 404 DERs between the recurrence and non-recurrence groups were identified. Based on the six independent prognosis-associated lncRNAs (H19, KCNJ2-AS1, LINC00899, LINC01503, PRKAG2-AS1, and SRRM2-AS1), the risk score system was constructed. After the independent clinical prognostic factors (pathologic M, pathologic T, and RS model status) were identified, the nomogram survival model was built. In the ceRNA regulatory network, there were three lncRNAs, four miRNAs, and 77 mRNAs. Additionally, the PPAR signaling pathway and the hedgehog signaling pathway were enriched for the mRNAs in the ceRNA regulatory network. The risk score system and the nomogram survival model might be used for predicting COAD recurrence. In addition, the PPAR signaling pathway and the hedgehog signaling pathway might affect the recurrence of COAD patients.
Project description:A robust and accurate gene expression signature is essential to help oncologists determine which subset of patients at a similar Tumor-Lymph Node-Metastasis (TNM) stage has a high recurrence risk and could benefit from adjuvant therapies. Here we applied a two-step supervised machine-learning method and established a 12-gene expression signature to predict colon adenocarcinoma (COAD) prognosis using COAD RNA-seq transcriptome data from The Cancer Genome Atlas (TCGA). The predictive performance of the 12-gene signature was validated with two independent gene expression microarray datasets: GSE39582 includes 566 COAD cases for the development of six molecular subtypes with distinct clinical, molecular and survival characteristics; GSE17538 is a dataset containing 232 colon cancer patients for the generation of a metastasis gene expression profile to predict recurrence and death in COAD patients. The signature could effectively separate the poor prognosis patients from the good prognosis group (disease specific survival (DSS): Kaplan Meier (KM) Log Rank p = 0.0034; overall survival (OS): KM Log Rank p = 0.0336) in GSE17538. For patients with a proficient mismatch repair system (pMMR) in GSE39582, the signature could also effectively distinguish the high risk group from the low risk group (OS: KM Log Rank p = 0.005; relapse free survival (RFS): KM Log Rank p = 0.022). Interestingly, advanced stage patients were significantly enriched in the high 12-gene score group (Fisher's exact test p = 0.0003). After stage stratification, the signature could still distinguish poor prognosis patients in GSE17538 from good prognosis patients within stage II (Log Rank p = 0.01) and stage II & III (Log Rank p = 0.017) in the outcome of DFS. Among stage III or stage II/III pMMR patients treated with adjuvant chemotherapy (ACT), those with a higher 12-gene score showed poorer prognosis (III, OS: KM Log Rank p = 0.046; III & II, OS: KM Log Rank p = 0.041).
Among stage II/III pMMR patients with lower 12-gene scores in GSE39582, the subgroup receiving ACT showed significantly longer OS time compared with those who received no ACT (Log Rank p = 0.021), while there is no obvious difference between counterparts among patients with higher 12-gene scores (Log Rank p = 0.12). Besides COAD, our 12-gene signature is multifunctional in several other cancer types including kidney cancer, lung cancer, uveal and skin melanoma, brain cancer, and pancreatic cancer. Functional classification showed that seven of the twelve genes are involved in immune system function and regulation, so our 12-gene signature could potentially be used to guide decisions about adjuvant therapy for patients with stage II/III and pMMR COAD.
Project description:Background:Colon cancer is one of the most common health threats for humans owing to its high morbidity and mortality. Detecting potential prognosis risk biomarkers (PRBs) is essential for the improvement of therapeutic strategies and drug development. Currently, although integrated prognostic analysis of multi-omics data is insufficient for colon cancer, it has been reported to be valuable for improving the detection of PRBs in other cancer types. Aim:This study aims to detect potential PRBs for colon adenocarcinoma (COAD) samples from The Cancer Genome Atlas (TCGA) by integrating multi-omics data. Materials and Methods:The multi-omics-based prognostic analysis (MPA) model was first constructed to systematically analyze the prognosis of colon cancer based on four types of omics data (gene expression, exon expression, DNA methylation, and somatic mutations) from COAD samples. Then, the essential features related to prognosis were functionally annotated through a protein-protein interaction (PPI) network and cancer-related pathways. Moreover, the significance of those essential prognostic features was further confirmed by the target regulation simulation (TRS) model. Finally, an independent testing dataset, as well as a single-cell expression dataset, was used to validate the generality and repeatability of the PRBs detected in this study. Results:By integrating the results of MPA modeling with the PPI network, integrated pathway analysis, and TRS modeling, essential features with gene symbols such as EPB41, PSMA1, FGFR3, MRAS, LEP, C7orf46, LOC285000, LBP, ZNF35, SLC30A3, LECT2, RNF7, and DYNC1I1 were identified as PRBs with high potential as drug targets for COAD treatment. Validation on the independent testing dataset demonstrated that these PRBs could be applied to distinguish the prognosis of COAD patients. Moreover, the prognosis of patients with different clinical conditions could also be distinguished by the above PRBs.
Conclusions:The MPA and TRS models constructed in this paper, as well as the PPI network and integrated pathway analysis, could not only help detect PRBs as potential therapeutic targets for COAD patients but also make it a paradigm for the prognostic analysis of other cancers.
Project description:In this study, we collected genes related to energy metabolism, used gene expression data from public databases to classify molecular subtypes of colon cancer (COAD) based on these genes, and further evaluated the relationships between the molecular subtypes and prognosis and clinical characteristics. Differential expression analysis of the molecular subtypes yielded 1948 differentially expressed genes (DEGs), whose functions were closely related to the occurrence and development of cancer. Based on the DEGs, we constructed a 4-gene prognostic risk model and identified the high expression of FOXD4, ENPEP, HOXC6, and ALOX15B as a risk factor associated with a high risk of developing COAD. The 4-gene signature showed strong robustness and stable predictive performance in datasets from different platforms, not only in patients with early COAD but also in all patients with colon cancer. The enriched pathways of the 4-gene signature in the high- and low-risk groups obtained by GSEA were significantly related to the occurrence and development of colon cancer. Moreover, the results of qPCR, immunohistochemistry staining, and Western blot assays revealed that FOXD4, ENPEP, HOXC6, and ALOX15B are overexpressed in CRC tissues and cells. These results suggest that the signature could potentially be used as a prognostic marker for clinical diagnosis.