Identification of Critical Genes and Five Prognostic Biomarkers Associated with Colorectal Cancer.
ABSTRACT: BACKGROUND Colorectal cancer (CRC) is a common malignant tumor with high incidence and mortality worldwide. The aim of this study was to evaluate the association between differentially expressed genes (DEGs), which may function as biomarkers for CRC prognosis and therapies, and the clinical outcome in patients with CRC. MATERIAL AND METHODS A total of 116 normal mucosal tissue and 930 CRC tissue datasets were downloaded from the Gene Expression Omnibus (GEO) database and The Cancer Genome Atlas (TCGA). After DEGs were screened with the limma package in R, Gene Ontology (GO) and KEGG enrichment analyses were performed and protein-protein interaction (PPI) networks were constructed to predict the function of these DEGs. Meanwhile, Cox proportional hazards regression was used to build a prognostic model from these DEGs. Then, Kaplan-Meier risk analysis was used to test the model in the TCGA datasets and validation datasets. RESULTS In the present study, 300 DEGs, comprising 100 upregulated and 200 downregulated genes, were identified. The PPI networks including 162 DEGs and 256 nodes were constructed and 2 modules with high degree were selected. Moreover, 5 genes (MMP1, ACSL6, SMPD1, PPARGC1A, and HEPACAM2) were identified using stepwise Cox proportional hazards regression. Kaplan-Meier risk curves in the TCGA and validation cohorts showed that the high-risk group had significantly poorer overall survival than the low-risk group. CONCLUSIONS Our study provided insights into the mechanisms of CRC formation and found 5 prognostic genes, which could potentially inform further studies and clinical therapies.
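The risk stratification described in the abstract above can be illustrated with a minimal sketch. It assumes the risk score is the linear predictor of the Cox model (the sum of coefficient times expression over the 5 genes) and that patients are split at the cohort median; the abstract states neither the coefficients nor the cutoff, so the values in `COEFS`, the patient data, and the median rule are all hypothetical.

```python
from statistics import median

# Hypothetical coefficients for illustration only; the abstract does not report them.
COEFS = {"MMP1": 0.12, "ACSL6": -0.30, "SMPD1": 0.25, "PPARGC1A": -0.18, "HEPACAM2": -0.22}


def risk_score(expression, coefs=COEFS):
    """Cox linear predictor: sum of coefficient * expression over the model genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Made-up expression values for four hypothetical patients.
patients = {
    "P1": {"MMP1": 8.0, "ACSL6": 2.0, "SMPD1": 5.0, "PPARGC1A": 1.0, "HEPACAM2": 1.5},
    "P2": {"MMP1": 3.0, "ACSL6": 6.0, "SMPD1": 2.0, "PPARGC1A": 5.0, "HEPACAM2": 6.0},
    "P3": {"MMP1": 7.0, "ACSL6": 3.0, "SMPD1": 4.0, "PPARGC1A": 2.0, "HEPACAM2": 2.0},
    "P4": {"MMP1": 4.0, "ACSL6": 5.0, "SMPD1": 3.0, "PPARGC1A": 4.0, "HEPACAM2": 5.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

The two resulting groups are what a Kaplan-Meier comparison would then be run on; other cutoffs (for example an optimized threshold) are equally plausible and would change group membership.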
Project description:Background:Colorectal cancer (CRC) is a common human malignancy. The aims of this study were to investigate the gene expression profile of CRC and to explore potential strategies for CRC diagnosis, therapy and prognosis. Methods:We used the affy and limma packages of Bioconductor in R to identify differentially expressed genes (DEGs) and differentially expressed lncRNAs (DELs) in the gene datasets (GSE8671, GSE21510, GSE32323, GSE39582 and TCGA). The DEGs were then analyzed by GO and KEGG pathway enrichment, and Kaplan-Meier survival curves and Cox regression analyses were used to find aberrantly expressed genes associated with the survival outcome of CRC patients. Real-time PCR was used to verify the expression of the aberrantly expressed genes in CRC samples. Results:In total, 306 up-regulated and 213 down-regulated common DEGs were found. A total of 485 DELs were identified, of which 241 were up-regulated and 244 were down-regulated. GO and KEGG pathway analyses showed that the DEGs were involved in cell cycle, mineral absorption, DNA replication, and nitrogen metabolism. Among them, Kaplan-Meier survival curves and Cox regression analyses revealed that CDC6, CDC45, ORC6 and SNHG7 levels were significantly associated with the survival outcome of CRC patients. Finally, real-time PCR verified that CDC6, CDC45, ORC6 and SNHG7 expression was up-regulated in 198 CRC samples compared with individual-matched adjacent mucosa samples. Conclusion:CDC6, CDC45, ORC6 and SNHG7 are implicated in CRC initiation and progression and could be explored as potential diagnostic, therapeutic and prognostic targets for CRC.
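For readers unfamiliar with the Kaplan-Meier survival curves used in the study above, the following is a minimal sketch of the product-limit estimator in plain Python. The follow-up times and event flags are invented for illustration and are not taken from any of the cited datasets.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    curve = []
    survival = 1.0
    for t in sorted(deaths):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Toy cohort: six hypothetical patients (months of follow-up, event flag).
times = [5, 8, 12, 12, 20, 25]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Censored patients (flag 0) contribute to the at-risk count up to their last follow-up but never trigger a drop in the curve, which is the property that distinguishes this estimator from a simple proportion of survivors.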
Project description:Colorectal cancer (CRC) is a life-threatening disease with a poor prognosis. Therefore, it is crucial to identify molecular prognostic biomarkers for CRC. The present study aimed to identify potential key genes that could be used to predict the prognosis of patients with CRC. Three CRC microarray datasets (GSE20916, GSE73360 and GSE44861) were downloaded from the Gene Expression Omnibus (GEO) database, and one dataset was obtained from The Cancer Genome Atlas (TCGA) database. The three GEO datasets were analyzed to detect differentially expressed genes (DEGs) using the BRB-ArrayTools software. Functional and pathway enrichment analyses of these DEGs were performed using the Database for Annotation, Visualization and Integrated Discovery tool. A protein-protein interaction (PPI) network of DEGs was constructed, hub genes were extracted, and modules of the PPI network were analyzed. To investigate the prognostic values of the hub genes in CRC, data from the CRC datasets of TCGA were used to perform the survival analyses based on the sample splitting method and Cox regression model. Correlation among the hub genes was evaluated using Spearman's correlation analysis. In the three GEO datasets, a total of 105 common DEGs were identified, including 51 down- and 54 up-regulated genes in CRC compared with normal colorectal tissues. A PPI network consisting of 100 DEGs and 551 edges was constructed, and 44 nodes were identified as hub genes. Among these 44 genes, the four hub genes TIMP metallopeptidase inhibitor 1 (TIMP1), solute carrier family 4 member 4 (SLC4A4), aldo-keto reductase family 1 member B10 (AKR1B10) and ATP binding cassette subfamily E member 1 (ABCE1) were associated with overall survival (OS) in patients with CRC. Three significant modules were extracted from the PPI network. The hub gene TIMP1 was present in Module 1, ABCE1 was involved in Module 2 and SLC4A4 was identified in Module 3. 
Univariate analysis revealed that TIMP1, SLC4A4, AKR1B10 and ABCE1 were associated with the OS of patients with CRC. Multivariate analysis demonstrated that SLC4A4 may be an independent prognostic factor associated with OS. Furthermore, the results from correlation analysis revealed that there was no correlation between TIMP1, SLC4A4 and ABCE1, whereas AKR1B10 was positively correlated with SLC4A4. In conclusion, the four key genes TIMP1, SLC4A4, AKR1B10 and ABCE1 associated with the OS of patients with CRC were identified by integrated bioinformatics analysis. These key genes may be used as prognostic biomarkers to predict the survival of patients with CRC, and may therefore represent novel therapeutic targets for CRC.
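The Spearman correlation analysis mentioned above amounts to a Pearson correlation computed on ranks. A minimal sketch in plain Python follows; the expression values are invented, and only the gene names come from the abstract.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values for two genes across five samples.
akr1b10 = [1.0, 2.5, 3.1, 4.8, 6.0]
slc4a4 = [0.9, 2.0, 2.2, 5.5, 5.1]
rho = spearman(akr1b10, slc4a4)
```

Because only ranks enter the calculation, the coefficient is insensitive to monotone transformations such as the log scaling commonly applied to expression data.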
Project description:Background:Lung adenocarcinoma (LUAD) is the most frequent histological type of lung cancer, and its incidence has displayed an upward trend in recent years. Nevertheless, little is known regarding effective biomarkers for LUAD. Methods:The robust rank aggregation method was used to mine differentially expressed genes (DEGs) from the Gene Expression Omnibus (GEO) datasets. The Search Tool for the Retrieval of Interacting Genes (STRING) database was used to extract hub genes from the protein-protein interaction (PPI) network. The expression of the hub genes was validated using expression profiles from TCGA and Oncomine databases and was verified by real-time quantitative PCR (qRT-PCR). The module and survival analyses of the hub genes were performed using Cytoscape and Kaplan-Meier curves. The function of KIF4A as a hub gene was investigated in LUAD cell lines. Results:The PPI analysis identified seven DEGs including BIRC5, DLGAP5, CENPF, KIF4A, TOP2A, AURKA, and CCNA2, which were significantly upregulated in Oncomine and TCGA LUAD datasets, and were verified by qRT-PCR in our clinical samples. We performed overall and disease-free survival analyses of the seven hub genes using GEPIA. We further found that CENPF, DLGAP5, and KIF4A expression levels were positively correlated with clinical stage. In LUAD cell lines, proliferation and migration were inhibited and apoptosis was promoted by knocking down KIF4A expression. Conclusion:We have identified new DEGs and functional pathways involved in LUAD. KIF4A, as a hub gene, promoted the progression of LUAD and might represent a potential therapeutic target for molecular cancer therapy.
Project description:Hepatocellular carcinoma (HCC) is a heterogeneous malignancy, which is a major cause of cancer morbidity and mortality worldwide. Thus, the aim of the present study was to identify the hub genes and underlying pathways of HCC via bioinformatics analyses. The present study screened three datasets, including GSE112790, GSE84402 and GSE74656 from the Gene Expression Omnibus (GEO) database, and downloaded the RNA-sequencing data of HCC from The Cancer Genome Atlas (TCGA) database. The differentially expressed genes (DEGs) in both the GEO and TCGA datasets were filtered, and the screened DEGs were subsequently analyzed for functional enrichment pathways. A protein-protein interaction (PPI) network was constructed, and hub genes were further screened to create Kaplan-Meier curves using cBioPortal. The expression levels of hub genes were then validated in different datasets using the Oncomine database. In addition, associations between expression and tumor grade, hepatitis virus infection status, satellites and vascular invasion were assessed. A total of 126 DEGs were identified, comprising 70 upregulated genes and 56 downregulated genes from the GEO and TCGA databases. By constructing the PPI network, the present study identified hub genes, including cyclin B1 (CCNB1), cell-division cycle protein 20 (CDC20), cyclin-dependent kinase 1, BUB1 mitotic checkpoint serine/threonine kinase β (BUB1B), cyclin A2, nucleolar and spindle associated protein 1, ubiquitin-conjugating enzyme E2 C (UBE2C) and ZW10 interactor. Furthermore, upregulated CCNB1, CDC20, BUB1B and UBE2C expression levels indicated worse disease-free and overall survival. Moreover, a meta-analysis of tumor and healthy tissues in the Oncomine database demonstrated that BUB1B and UBE2C were highly expressed in HCC. 
The present study also analyzed the data of HCC in TCGA database using univariate and multivariate Cox analyses, and demonstrated that BUB1B and UBE2C may be used as independent prognostic factors. In conclusion, the present study identified several genes and the signaling pathways that were associated with tumorigenesis using bioinformatics analyses, which could be potential targets for the diagnosis and treatment of HCC.
Project description:Aim: To identify potential key candidate genes whose expression and clinical significance were further assessed in colorectal cancer (CRC). Methods: Three original microarray datasets (GSE41328, GSE22598, and GSE23878) from NCBI-GEO were used to analyze differentially expressed genes (DEGs) in CRC. Online database analyses through Oncomine and GEPIA were performed to evaluate SLC4A4 expression and explore the prognostic merit of SLC4A4 expression, which was further confirmed by analyses of a QPCR-based cDNA array and an IHC-based tissue microarray (TMA). The STRING website was used to explore the interaction between SLC4A4 and other DEGs based on the protein-protein interaction (PPI) networks. Results: Analysis of three original microarray datasets from GEO identified 82 shared differentially expressed genes (28 upregulated and 54 downregulated) in CRC tissues. Online analyses from Oncomine and GEPIA revealed lower SLC4A4 mRNA expression in CRC tissues compared with adjacent normal tissues, which was further confirmed at both the mRNA and protein levels by the QPCR-based cDNA array and IHC-based TMA analyses. Survival analyses through GEPIA and from the TMA demonstrated that low SLC4A4 expression is correlated with worse overall survival among patients with CRC. Survival analysis from Kaplan-Meier Plotter demonstrated that low SLC4A4 expression is significantly associated with poor prognosis (including relapse-free survival, overall survival, distant metastasis-free survival, and post-progression survival) of patients with breast cancer, lung cancer, gastric cancer, and ovarian cancer. PPI analysis found that SLC4A4 is highly correlated with various genes, including SLC9A3, SLC26A6, ENSG00000214921, SLC26A4, SLC9A3R1, and SLC9A1. 
Conclusion: The mRNA and protein levels of SLC4A4 were decreased in CRC tissues, and low expression of SLC4A4 significantly correlated with shorter survival of CRC patients and poorer prognosis of patients with breast cancer, lung cancer, gastric cancer and ovarian cancer, suggesting a potential role of SLC4A4 in tumor suppression and prognostic prediction in multiple malignancies including CRC.
Project description:The present techniques of clinical and histopathological diagnosis can hardly distinguish chromophobe renal cell carcinoma (ChRCC) from renal oncocytoma (RO). To identify differentially expressed genes (DEGs) as effective biomarkers for diagnosis and prognosis of ChRCC and RO, three mRNA microarray datasets (GSE12090, GSE19982, and GSE8271) were downloaded from the GEO database. Functional enrichment analysis of DEGs was performed by DAVID. STRING and Cytoscape were applied to construct the protein-protein interaction (PPI) network and key modules of DEGs. Plots were generated in R. We downloaded clinical data from the TCGA database, and the influence of key genes on the overall survival of patients with ChRCC was assessed by Kaplan-Meier and Cox analyses. Gene set enrichment analysis (GSEA) was used to explore the function of key genes. A total of 79 DEGs were identified. Enrichment analyses revealed that the DEGs are closely related to tissue invasion and metastasis of cancer. Subsequently, 14 hub genes including ESRP1, AP1M2, CLDN4, and CLDN7 were detected. Kaplan-Meier analysis indicated that low expression of CLDN7 and GNAS was related to worse overall survival in patients with ChRCC. Univariate Cox analysis showed that CLDN7 might be a helpful biomarker for ChRCC prognosis. Subgroup analysis revealed that the expression of CLDN7 showed a downtrend with the development of the clinical stage, topography, and distant metastasis of ChRCC. GSEA identified that cell adhesion molecules (CAMs), B cell receptor signaling pathway, T cell receptor signaling pathway, RIG-I like receptor signaling pathway, Toll-like receptor signaling pathway, and apoptosis pathway were associated with the expression of CLDN7. In conclusion, ESRP1, AP1M2, CLDN4, PRSS8, and CLDN7 were found to distinguish ChRCC from RO. 
Besides, the low expression of CLDN7 was closely related to ChRCC progression and could serve as an independent risk factor for the overall survival in patients with ChRCC.
Project description:Colorectal cancer (CRC) is one of the most common and deadly malignancies in the world. In China, the morbidity rate of CRC increased during the period 2000 to 2011. Biomarker detection for early CRC diagnosis can effectively reduce the mortality of patients with CRC. To explore the underlying mechanisms of effective biomarkers and identify more of them, we performed weighted correlation network analysis (WGCNA) on the GSE68468 dataset generated from 378 CRC tissue samples. We screened the gene set (module) that was significantly associated with CRC histology and analyzed the hub genes. The key genes were identified by obtaining six raw colorectal datasets (i.e., GSE25070, GSE44076, GSE44861, GSE21510, GSE9348, and GSE21815) from the GEO database (https://www.ncbi.nlm.nih.gov/geo). The robust differentially expressed genes (DEGs) across all six datasets were obtained using the "RobustRankAggreg" package in R 3.5.1. An integrated analysis of CRC based on the top 50 downregulated DEGs and hub genes in the red module from WGCNA was conducted, and the intersecting genes were screened. Kaplan-Meier plots were then analyzed, and the genes associated with CRC prognosis based on patients from the TCGA database were determined. Finally, we validated the candidate gene in our clinical CRC specimens. We postulated that the candidate genes screened from the database and verified by our clinical pathological data may contribute to understanding the molecular mechanisms of tumorigenesis and may serve as potential biomarkers for CRC diagnosis and treatment.
Project description:Glioblastoma (GBM), characterized by high morbidity and mortality, is one of the most common lethal diseases worldwide. To identify the molecular mechanisms that contribute to the development of GBM, three cohort profile datasets (GSE50161, GSE90598 and GSE104291) were integrated and thoroughly analyzed; these datasets included 57 GBM cases and 22 cases of normal brain tissue. The current study identified differentially expressed genes (DEGs), and analyzed potential candidate genes and pathways. Additionally, a DEG-associated protein-protein interaction (PPI) network was established for further investigation. Then, the hub genes associated with prognosis were identified using a Kaplan-Meier analysis based on The Cancer Genome Atlas database. Firstly, the current study identified 378 consistent DEGs (240 upregulated and 138 downregulated). Secondly, a cluster analysis of the DEGs was performed based on their functions, and signaling pathways were analyzed using the enrichment analysis tool on DAVID. Thirdly, 245 DEGs were identified using PPI network analysis. Among them, two co-expression modules comprising 30 and 27 genes, respectively, and 35 hub genes were identified using Cytoscape MCODE. Finally, Kaplan-Meier analysis of the hub genes revealed that the increased expression of calcium-binding protein 1 (CABP1) was negatively associated with relapse-free survival. To summarize, all enriched Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways may participate in mechanisms underlying GBM occurrence and progression; however, further studies are required. CABP1 may be a key gene associated with the biological process of GBM development and may be involved in a crucial mechanism of GBM progression.
Project description:The present study aimed to explore important estrogen receptor-associated genes and to determine the potential pathogenic and prognostic factors for lung adenocarcinoma in non-smoking females. The gene expression profiles of the two datasets (GSE32863 and GSE75037) were downloaded from the Gene Expression Omnibus (GEO) database. Data for non-smoking female patients with lung adenocarcinoma from The Cancer Genome Atlas (TCGA) database were also downloaded. The Linear Models for Microarray Data package in R was used to explore the differentially expressed genes (DEGs) between samples from non-smoking female patients with lung adenocarcinoma and samples of adjacent non-cancerous lung tissue. The Database for Annotation, Visualization and Integrated Discovery was used for functional enrichment of the DEGs. The Search Tool for the Retrieval of Interacting Genes/Proteins and Cytoscape software were used to obtain a protein-protein interaction (PPI) network and to identify the hub genes. In addition, the network between the estrogen receptor and the DEGs was constructed. A Kaplan-Meier survival plot was used to analyze the overall survival (OS). In total, 248 DEGs were identified in the GEO database, and 2,362 DEGs were identified in TCGA database. The intersection of the two datasets (DEGs in GEO and TCGA) revealed 170 DEGs, and these were selected for further investigation. Gene Ontology was used to group the 170 DEGs into biological process, molecular function and cellular component categories. Kyoto Encyclopedia of Genes and Genomes pathway analysis was subsequently performed. A total of 27 hub genes, including caveolin 1 (CAV1), matrix metallopeptidase 9 (MMP9), secreted phosphoprotein 1 (SPP1) and collagen type I α 1 chain (COL1A1), were closely associated with the estrogen receptor. CAV1 and SPP1 were associated with the OS. However, MMP9 and COL1A1 did not have any significant effect on OS. 
In summary, the identification of CAV1, MMP9, SPP1 and COL1A1 may provide novel insights into the molecular mechanism of lung adenocarcinoma in non-smoking female patients, and the results obtained in the current study may guide future clinical studies.
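The step above in which DEGs from GEO and TCGA are intersected can be sketched as a set operation. The version below additionally requires that a gene change in the same direction in both sources; that direction check is an assumption of this sketch and is not stated in the abstract. The log2 fold-change values and the gene `GENE_X` are hypothetical.

```python
def shared_degs(geo, tcga, same_direction=True):
    """Genes called as DEGs in both sources.

    geo, tcga      : dict mapping gene symbol -> log2 fold change (already filtered to DEGs)
    same_direction : if True, keep only genes up- or downregulated in both sources
    """
    common = geo.keys() & tcga.keys()
    if same_direction:
        common = {g for g in common if (geo[g] > 0) == (tcga[g] > 0)}
    return sorted(common)


# Made-up fold changes; only the gene symbols CAV1, MMP9, SPP1 and COL1A1 appear in the text.
geo = {"CAV1": -2.1, "MMP9": 1.8, "SPP1": 3.0, "GENE_X": 1.2}
tcga = {"CAV1": -1.7, "MMP9": 2.2, "SPP1": 2.5, "COL1A1": 2.9, "GENE_X": -1.1}

consistent = shared_degs(geo, tcga)
any_direction = shared_degs(geo, tcga, same_direction=False)
```

Here `GENE_X` is dropped from `consistent` because its sign differs between the two sources, while `COL1A1` is dropped from both results because it is absent from the toy GEO list.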
Project description:Colorectal cancer (CRC) is a prevalent malignant tumour type arising from the colon and rectum. The present study aimed to explore the molecular mechanisms of the development and progression of CRC. Initially, differentially expressed genes (DEGs) between CRC tissues and corresponding non-cancerous tissues were obtained by analysing the GSE15781 microarray dataset. The Database for Annotation, Visualization and Integrated Discovery was then utilized for functional and pathway enrichment analysis of the DEGs. Subsequently, a protein-protein interaction (PPI) network was created using the Search Tool for the Retrieval of Interacting Genes and Proteins database and visualized by Cytoscape software. Furthermore, CytoNCA, a Cytoscape plugin, was used for centrality analysis of the PPI network to identify crucial genes. Finally, UALCAN was employed to validate the expression of the crucial genes and to estimate their effect on the survival of patients with colon cancer by Kaplan-Meier curves and log-rank tests. A total of 1,085 DEGs, including 496 upregulated and 589 downregulated genes, were screened out. The DEGs identified were enriched in various pathways, including 'metabolic pathway', 'cell cycle', 'DNA replication', 'nitrogen metabolism', 'p53 signalling' and 'fatty acid degradation'. PPI network analysis suggested that interleukin-6, MYC, NOTCH1, inhibin subunit βA (INHBA), CDK1, cyclin (CCN)B1 and CCNA2 were crucial genes, and their expression levels were markedly upregulated. Survival analysis suggested that upregulated INHBA significantly decreased the survival probability of patients with CRC. Conversely, upregulation of CCNB1 and CCNA2 expression levels was associated with increased survival probabilities. 
The identified DEGs, particularly the crucial genes, may enhance the current understanding of the genesis and progression of CRC, and certain genes, including INHBA, CCNB1 and CCNA2, may be candidate diagnostic and prognostic markers, as well as targets for the treatment of CRC.
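The centrality analysis used above to pick crucial genes from a PPI network can be illustrated with degree centrality, the simplest such measure: a gene's score is its number of distinct interaction partners. This is only a sketch of the general idea, not a reproduction of CytoNCA, and the edge list below is a toy example rather than actual STRING output.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Count distinct interaction partners per gene in an undirected network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {gene: len(partners) for gene, partners in neighbors.items()}


def top_hubs(edges, k):
    """Return the k highest-degree genes, ties broken alphabetically."""
    degree = degree_centrality(edges)
    return sorted(degree, key=lambda g: (-degree[g], g))[:k]


# Hypothetical interactions among genes named in the abstract.
edges = [
    ("CDK1", "CCNB1"),
    ("CDK1", "CCNA2"),
    ("CDK1", "MYC"),
    ("CCNB1", "CCNA2"),
    ("MYC", "NOTCH1"),
    ("IL6", "MYC"),
]

degree = degree_centrality(edges)
hubs = top_hubs(edges, 2)
```

Other centrality measures, such as betweenness or closeness, can rank the same network differently, so the choice of measure affects which genes are reported as hubs.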