MiningABs: mining associated biomarkers across multi-connected gene expression datasets.
ABSTRACT: Human disease often arises as a consequence of alterations in a set of associated genes rather than alterations to a set of unassociated individual genes. Most previous microarray-based meta-analyses identified disease-associated genes or biomarkers independent of genetic interactions. Therefore, in this study, we present the first meta-analysis method capable of taking gene combination effects into account to efficiently identify associated biomarkers (ABs) across different microarray platforms. We propose a new meta-analysis approach called MiningABs to mine ABs across different array-based datasets. The similarity between paired probe sequences is quantified as a bridge to connect these datasets together. The ABs can be subsequently identified from an "improved" common logit model (c-LM) by combining several sibling-like LMs in a heuristic genetic algorithm selection process. Our approach is evaluated with two sets of gene expression datasets: i) 4 esophageal squamous cell carcinoma and ii) 3 hepatocellular carcinoma datasets. Based on an unbiased reciprocal test, we demonstrate that each gene in a group of ABs is required to maintain high cancer sample classification accuracy, and we observe that ABs are not limited to genes common to all platforms. Investigating the ABs using Gene Ontology (GO) enrichment, literature survey, and network analyses indicated that our ABs are not only strongly related to cancer development but also highly connected in a diverse network of biological interactions. The proposed meta-analysis method called MiningABs is able to efficiently identify ABs from different independently performed array-based datasets, and we show its validity in cancer biology via GO enrichment, literature survey and network analyses. We postulate that the ABs may facilitate novel target and drug discovery, leading to improved clinical treatment.
Java source code, tutorial, example and related materials are available at "http://sourceforge.net/projects/miningabs/".
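The abstract above says that similarity between paired probe sequences serves as a bridge between platforms, but it does not say how that similarity is computed. The following is a minimal sketch of the general idea only, assuming a generic string-similarity ratio from Python's standard library as a stand-in measure. The function names, the 0.9 threshold, and the probe IDs and sequences are all hypothetical and are not taken from MiningABs.

```python
from difflib import SequenceMatcher


def probe_similarity(seq_a: str, seq_b: str) -> float:
    """Return a similarity score in [0, 1] for two probe sequences.

    Assumption: difflib's ratio is used as a stand-in; the abstract does not
    specify the similarity measure that MiningABs actually uses.
    """
    matcher = SequenceMatcher(None, seq_a.upper(), seq_b.upper(), autojunk=False)
    return matcher.ratio()


def bridge_probes(platform_a, platform_b, threshold=0.9):
    """Link each probe on platform A to its most similar probe on platform B.

    Only links whose score reaches the (hypothetical) threshold are kept.
    """
    links = {}
    for id_a, seq_a in platform_a.items():
        best_id, best_score = None, 0.0
        for id_b, seq_b in platform_b.items():
            score = probe_similarity(seq_a, seq_b)
            if score > best_score:
                best_id, best_score = id_b, score
        if best_id is not None and best_score >= threshold:
            links[id_a] = (best_id, best_score)
    return links


# Hypothetical probe IDs and sequences, for illustration only.
platform_a = {
    "a_probe_1": "ACGTACGTACGTACGTACGT",
    "a_probe_2": "TTTTGGGGCCCCAAAATTTT",
}
platform_b = {
    "b_probe_9": "ACGTACGTACGTACGTACGA",
    "b_probe_7": "GATTACAGATTACAGATTAC",
}
links = bridge_probes(platform_a, platform_b)
```

In this toy input only `a_probe_1` finds a partner above the threshold, which illustrates how a bridge of this kind can connect probes that are not annotated identically on both platforms.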
Project description: Combined abiotic stress (CAbS) affects field-grown plants simultaneously. The multigenic and quantitative nature of uncontrollable abiotic stresses complicates the understanding of plant stress responses. Considering this, we analyzed the CAbS response of the C3 model plant Oryza sativa by meta-analysis. Datasets of genes commonly expressed under drought, salinity, submergence, metal, natural expression, biotic, and abiotic stresses were mined from publicly accessible transcriptomic abiotic stress (AbS)-responsive datasets. Of these, 1,175, 12,821, and 42,877 genes were commonly expressed in the meta-differential, individual differential, and unchanged expression categories, respectively. The 100 highly regulated, differentially expressed AbS genes were derived through integrative meta-analysis of expression data (INMEX). Of these, 30 genes were identified from AbS gene families through an expression atlas and were computationally analyzed for their physicochemical properties. All AbS genes were physically mapped against the O. sativa genome. Comparative mapping of these genes demonstrated an orthologous relationship with the related C4 panicoid genome. In silico expression analysis of these genes showed differential expression patterns in different developmental tissues. Protein-protein interaction analysis of these genes reflected the complexity of AbS. Computational expression profiling of candidate genes in response to multiple stresses suggested the putative involvement of OS05G0350900, OS02G0612700, OS05G0104200, OS03G0596200, OS12G0225900, OS07G0152000, OS08G0119500, OS06G0594700, and Os01g0393100 in CAbS. These potential candidate genes need to be studied further to decipher their functional roles in AbS dynamics.
Project description: BACKGROUND: To uncover the genes involved in the development of osteosarcoma (OS), we performed a meta-analysis of OS microarray data to identify differentially expressed genes (DEGs) and biological functions associated with gene expression changes between OS and normal control (NC) tissues. METHODS: We used publicly available GEO datasets of OS to perform a meta-analysis. We performed Gene Ontology (GO) enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and protein-protein interaction (PPI) network analysis. RESULTS: Eight GEO datasets, including 240 samples of OS and 35 samples of controls, were available for the meta-analysis. We identified 979 DEGs across the studies between OS and NC tissues (472 up-regulated and 507 down-regulated). We found GO terms for molecular functions significantly enriched in protein binding (GO:0005515, P = 3.83E-60) and calcium ion binding (GO:0005509, P = 3.79E-13), while for biological processes, the enriched GO terms were cell adhesion (GO:0007155, P = 2.26E-19) and negative regulation of apoptotic process (GO:0043066, P = 3.24E-15), and for cellular component, the enriched GO terms were cytoplasm (GO:0005737, P = 9.18E-63) and extracellular region (GO:0005576, P = 2.28E-47). The most significant pathway in our KEGG analysis was Focal adhesion (P = 5.70E-15). Furthermore, ECM-receptor interaction (P = 1.27E-13) and Cell cycle (P = 4.53E-11) were found to be highly enriched. PPI network analysis indicated that the significant hub proteins included PTBP2 (Degree = 33), RGS4 (Degree = 15) and FXYD6 (Degree = 13). CONCLUSIONS: Our meta-analysis detected DEGs and biological functions associated with gene expression changes between OS and NC tissues, guiding further identification and treatment for OS.
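The hub proteins in the abstract above are reported by their degree, that is, the number of distinct interaction partners a protein has in the PPI network. A minimal sketch of that calculation follows, assuming the network is given as an undirected edge list. The node names are hypothetical placeholders, not proteins or interactions from the study.

```python
from collections import Counter


def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected network.

    Duplicate edges (in either orientation) and self-loops are ignored.
    """
    seen = set()
    degree = Counter()
    for u, v in edges:
        if u == v:
            continue  # a self-interaction adds no partner
        key = frozenset((u, v))
        if key in seen:
            continue  # already counted this pair
        seen.add(key)
        degree[u] += 1
        degree[v] += 1
    return degree


def top_hubs(edges, n=3):
    """Return the n nodes with the highest degree as (node, degree) pairs."""
    return node_degrees(edges).most_common(n)


# Hypothetical edge list, for illustration only.
edges = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P3", "P1"),  # duplicate of ("P1", "P3")
    ("P5", "P5"),  # self-loop
]
hubs = top_hubs(edges, n=1)
```

Degree is only one of several centrality measures; the abstract does not say whether other measures were also considered when naming hub proteins.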
Project description: Clear cell renal cell carcinoma (ccRCC) is the most aggressive histological type of renal cell carcinoma (RCC) and accounts for 70-80% of all RCC cases. The aim of this study was to identify potential biomarkers in ccRCC and explore their underlying mechanisms. Four profile datasets were downloaded from the GEO database to identify DEGs. GO and KEGG analyses of the DEGs were performed with DAVID. A protein-protein interaction (PPI) network was constructed to predict hub genes. Hub gene expression in ccRCC across multiple datasets and overall survival were investigated using the Oncomine Platform and UALCAN, respectively. A meta-analysis was performed to explore the relationship between the hub gene EGFR and ccRCC. A total of 127 DEGs (55 upregulated genes and 72 downregulated genes) were identified from the four profile datasets. Integrating the results from the PPI network, the Oncomine Platform, and the survival analysis, EGFR, FLT1, and EDN1 were screened as key factors in the prognosis of ccRCC. GO and KEGG analyses revealed that the 127 DEGs were mainly enriched in 21 terms and 4 pathways. The meta-analysis showed a significant difference in EGFR expression between ccRCC tissues and normal tissues, and EGFR expression was higher in patients with metastasis. This study identified 3 important genes (EGFR, FLT1, and EDN1) in ccRCC, and EGFR may be a potential prognostic biomarker and novel therapeutic target for ccRCC, especially in patients with metastasis.
Project description: The aim of the present study was to identify differentially expressed (DE) genes in patients with osteoarthritis (OA), and biological processes associated with changes in gene expression that occur in this disease. Using the INMEX (integrative meta-analysis of expression data) software tool, a meta-analysis of publicly available microarray Gene Expression Omnibus (GEO) datasets of OA was performed. Gene ontology (GO) enrichment analysis was performed in order to detect enriched functional attributes based on gene-associated GO terms. Three GEO datasets, containing 137 patients with OA and 52 healthy controls, were included in the meta-analysis. The analysis identified 85 genes that were consistently differentially expressed in OA (30 genes were upregulated and 55 genes were downregulated). The upregulated gene with the lowest P-value (P=5.36E-07) was S-phase kinase-associated protein 2, E3 ubiquitin protein ligase (SKP2). The downregulated gene with the lowest P-value (P=4.42E-09) was Proline rich 5 like (PRR5L). Among the 210 GO terms that were associated with the set of DE genes, the most significant two enrichments were observed in the GO categories of 'Immune response', with a P-value of 0.000129438, and 'Immune effectors process', with a P-value of 0.000288619. The current meta-analysis identified genes that were consistently DE in OA, in addition to biological pathways associated with changes in gene expression that occur during OA, which may provide insight into the molecular mechanisms underlying the pathogenesis of this disease.
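GO enrichment P-values such as those quoted above are commonly obtained from a one-sided hypergeometric (Fisher-type) test, which asks how likely it is to see at least the observed number of term-annotated genes in the DE gene list by chance. The abstract does not state which test was applied, so the sketch below shows the hypergeometric approach purely as an assumption. The function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """One-sided enrichment P-value, P(X >= overlap).

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    selected:   size of the gene list being tested (e.g. DE genes)
    overlap:    genes in the list that carry the GO term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy counts, for illustration only: 10 background genes, 4 annotated,
# 3 selected, 2 of the selected genes annotated.
p_value = hypergeom_enrichment_p(population=10, annotated=4, selected=3, overlap=2)
```

In practice the raw P-values for many GO terms are usually adjusted for multiple testing before terms are reported as enriched; the abstract does not say whether the quoted values are adjusted.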
Project description: Cross-species translation of genomic information may play a pivotal role in applying biological knowledge gained from a relatively simple model system to other less studied, but related, genomes. The information on abiotic stress (ABS)-responsive genes in Arabidopsis was identified and translated into the legume model system, Medicago truncatula. Various data resources, such as TAIR/AtGI DB, expression profiles and the literature, were used to build a genome-wide list of ABS genes. tBlastX/BlastP similarity search tools and manual inspection of alignments were used to identify orthologous genes between the two genomes. A total of 1,377 genes were finally collected and classified into 18 functional criteria of gene ontology (GO). The data analysis according to the expression cues showed that there was a substantial level of interaction among three major types (i.e., drought, salinity and cold stress) of abiotic stresses. In an attempt to translate the ABS genes between these two species, genomic locations for each gene were mapped using an in-house-developed comparative analysis platform. The comparative analysis revealed that fragmental colinearity, represented by only 37 synteny blocks, existed between Arabidopsis and M. truncatula. Based on the combination of E-value and alignment remarks, the estimated translation rate was 60.2% for this cross-family translation. As a prelude to functional comparative genomic approaches, in silico gene network/interactome analyses were conducted to predict key components in the ABS responses, and one of the sub-networks was integrated with the corresponding comparative map. The results demonstrated that core members of the sub-network were well aligned with previously reported ABS regulatory networks. Taken together, the results indicate that network-based integrative approaches of comparative and functional genomics are important to interpret and translate genomic information for complex traits such as abiotic stresses.
Project description: The goal of this study was to identify potential transcriptomic markers in developing ankylosing spondylitis by a meta-analysis of multiple public microarray datasets. Using the INMEX (integrative meta-analysis of expression data) program, we performed the meta-analysis to identify consistently differentially expressed (DE) genes in ankylosing spondylitis and further performed functional interpretation (gene ontology analysis and pathway analysis) of the DE genes identified in the meta-analysis. Three microarray datasets (26 cases and 29 controls in total) were collected for meta-analysis. 905 consistently DE genes were identified in ankylosing spondylitis, among which 482 genes were upregulated and 423 genes were downregulated. The upregulated gene with the smallest combined rank product (RP) was GNG11 (combined RP=299.64). The downregulated gene with the smallest combined RP was S100P (combined RP=335.94). In the gene ontology (GO) analysis, the most significantly enriched GO term was "immune system process" (P=3.46×10^-26). The most significant pathway identified in the pathway analysis was antigen processing and presentation (P=8.40×10^-5). The consistently DE genes in ankylosing spondylitis and biological pathways associated with those DE genes identified provide valuable information for studying the pathophysiology of ankylosing spondylitis.
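The abstract above ranks genes by a combined rank product (RP). In its classic form, the rank product of a gene is the geometric mean of its ranks across datasets, so a gene that ranks near the top everywhere gets a small RP. The sketch below shows that classic form only; how INMEX computes its "combined RP" is not described in the abstract and may differ. All gene names, dataset names, and values are hypothetical.

```python
from math import prod


def rank_values(values, descending=True):
    """Return 1-based ranks for a list of values (rank 1 = largest by default)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def rank_product(fold_changes_by_dataset, descending=True):
    """Geometric mean of per-dataset ranks for genes present in every dataset.

    fold_changes_by_dataset maps dataset name -> {gene: log fold change}.
    With descending=True, small rank products flag consistently upregulated genes.
    """
    datasets = list(fold_changes_by_dataset.values())
    genes = sorted(set.intersection(*(set(d) for d in datasets)))
    per_gene_ranks = {gene: [] for gene in genes}
    for dataset in datasets:
        ranks = rank_values([dataset[g] for g in genes], descending=descending)
        for gene, rank in zip(genes, ranks):
            per_gene_ranks[gene].append(rank)
    k = len(datasets)
    return {gene: prod(ranks) ** (1.0 / k) for gene, ranks in per_gene_ranks.items()}


# Hypothetical log fold changes, for illustration only.
toy = {
    "dataset_1": {"g1": 2.0, "g2": 1.0, "g3": -1.0},
    "dataset_2": {"g1": 1.5, "g2": 3.0, "g3": -0.5},
}
rp_up = rank_product(toy)
```

Calling `rank_product(toy, descending=False)` ranks from the most negative fold change instead, which is the usual way to score downregulated genes with the same statistic.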
Project description: Understanding complex biological systems often requires a systems biology approach involving both top-down and bottom-up analyses. Numerous system components and their connections are best characterised as networks, which are primarily represented as graphs, with nodes connected by multiple edges. Inefficient network visualisation is a common problem related to transcriptomic and genomic datasets. In this article, we demonstrate an miRNA analysis framework with the help of Jatropha curcas healthy and disease transcriptome datasets, functioning as a pipeline derived from the graph theory universe, and discuss how network theory, along with gene ontology (GO) analysis, can be used to infer biological properties and other important features of a network. Network profiling, combined with GO, correlation, and co-expression analyses, can aid in efficiently understanding the biological significance of pathways, networks, and the studied system as a whole. The proposed framework may help experimental and computational biologists to analyse their own data and infer meaningful biological information.
Project description: The present study aimed to identify key genes involved in osteoarthritis (OA). Based on a bioinformatics analysis of five gene expression profiling datasets (GSE55457, GSE55235, GSE82107, GSE12021 and GSE1919), differentially expressed genes (DEGs) in OA were identified. Subsequently, a protein-protein interaction (PPI) network was constructed and its topological structure was analyzed. In addition, key genes in OA were identified following a principal component analysis (PCA) based on the DEGs in the PPI network. Finally, the functions and pathways enriched by these key genes were also analyzed. The PPI network consisted of 241 nodes and 576 interactions, including a total of 171 upregulated DEGs [e.g., aspartylglucosaminidase (AGA), CD58 and CD86] and a total of 70 downregulated DEGs (e.g., acetyl-CoA carboxylase ? and dihydropyrimidine dehydrogenase). The PPI network had the attributes of a scale-free small-world network. After PCA, 47 key genes were identified, including β-1,4-galactosyltransferase-1 (B4GALT1), AGA, CD58, CD86, ezrin, and eukaryotic translation initiation factor 4 γ 1 (EIF4G1). Subsequently, the 47 key genes were identified to be enriched in 13 Gene Ontology (GO) terms and 2 Kyoto Encyclopedia of Genes and Genomes pathways, with the GO terms involving B4GALT1 including positive regulation of developmental processes, protein amino acid terminal glycosylation and protein amino acid terminal N-glycosylation. In addition, B4GALT1 and EIF4G1 were confirmed to be downregulated in OA samples compared with healthy controls, but only EIF4G1 was determined to be significantly downregulated in OA samples, as determined via a meta-analysis of the 5 abovementioned datasets. In conclusion, B4GALT1 and EIF4G1 were indicated to have significant roles in OA, and B4GALT1 may be involved in positive regulation of developmental processes, protein amino acid terminal glycosylation and protein amino acid terminal N-glycosylation. The present study may enhance the current understanding of the molecular mechanisms of OA and provide novel therapeutic targets.
Project description: Objective: We aimed to explore potential molecular mechanisms of clear cell renal cell carcinoma (ccRCC) and provide candidate target genes for ccRCC gene therapy. Materials and Methods: This is a bioinformatics-based study. Microarray datasets GSE6344, GSE781 and GSE53000 were downloaded from the Gene Expression Omnibus database. Using meta-analysis, differentially expressed genes (DEGs) were identified between ccRCC and normal samples, followed by Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Gene Ontology (GO) function analyses. Then, protein-protein interaction (PPI) networks and modules were investigated. Furthermore, a miRNA-target gene regulatory network was constructed. Results: A total of 511 up-regulated and 444 down-regulated DEGs were determined in the present gene expression microarray data meta-analysis. These DEGs were enriched in functions such as immune system process and pathways such as the Toll-like receptor signaling pathway. A PPI network and eight modules were further constructed. A total of 10 outstanding DEGs, including TYRO protein tyrosine kinase binding protein (TYROBP), interferon regulatory factor 7 (IRF7) and PPARG co-activator 1 alpha (PPARGC1A), were detected in the PPI network. Furthermore, the miRNA-target gene regulation analyses showed that miR-412 and miR-199b targeted IRF7 and PPARGC1A, respectively, to regulate the immune response in ccRCC. Conclusion: TYROBP, IRF7 and PPARGC1A might play important roles in ccRCC by taking part in the immune system process.
Project description: MicroRNA (miR)-338-5p has been studied in hepatocellular carcinoma (HCC); however, its diagnostic value and the molecular mechanism underlying its actions remain to be elucidated. The present study aimed to validate the diagnostic ability of miR-338-5p and further explore the underlying molecular mechanism. Data from eligible studies, Gene Expression Omnibus (GEO) chips and The Cancer Genome Atlas (TCGA) datasets were gathered for data mining and an integrated meta-analysis, to comprehensively evaluate the significance of miR-338-5p in diagnosing HCC. The potential target genes of miR-338-5p were obtained from the intersection of the deregulated targets of miR-338-5p from GEO and TCGA with the target genes predicted by 12 online software tools. A protein-protein interaction (PPI) network was drawn to illustrate the interactions between target genes and to define the hub genes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed to investigate the function of the target genes. From the results, miR-338-5p exhibited favorable value in diagnosing HCC. Types of sample and experiment were defined as the possible sources of heterogeneity in the meta-analysis. A total of 423 genes were selected as the potential target genes of miR-338-5p, and five genes were defined as the hub genes from the PPI network. The GO and KEGG analyses indicated that the target genes were significantly enriched in the pathways of metabolic process and cell cycle. miR-338-5p may function as a novel diagnostic target for HCC through regulating certain target genes and signaling pathways.
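The target-gene selection described above combines three sources: genes deregulated in GEO, genes deregulated in TCGA, and genes predicted by online tools. A minimal sketch of one way to intersect such sources follows. The abstract does not say how many of the 12 tools had to agree, so the `min_tools` parameter is an assumption, and all gene and tool names are hypothetical placeholders.

```python
from collections import Counter


def predicted_by_enough_tools(predictions_by_tool, min_tools):
    """Return genes predicted as targets by at least min_tools prediction tools."""
    counts = Counter(
        gene for genes in predictions_by_tool.values() for gene in set(genes)
    )
    return {gene for gene, n in counts.items() if n >= min_tools}


def candidate_targets(deregulated_geo, deregulated_tcga, predictions_by_tool, min_tools=1):
    """Intersect deregulated genes from both cohorts with tool-predicted targets.

    Assumption: min_tools is a hypothetical agreement threshold; the abstract
    does not specify how the tool predictions were combined.
    """
    predicted = predicted_by_enough_tools(predictions_by_tool, min_tools)
    return set(deregulated_geo) & set(deregulated_tcga) & predicted


# Hypothetical inputs, for illustration only.
geo = {"GENE_A", "GENE_B", "GENE_C"}
tcga = {"GENE_B", "GENE_C", "GENE_D"}
tools = {
    "tool_1": ["GENE_B", "GENE_C"],
    "tool_2": ["GENE_C", "GENE_D"],
    "tool_3": ["GENE_C"],
}
targets = candidate_targets(geo, tcga, tools, min_tools=2)
```

Raising `min_tools` trades sensitivity for specificity: fewer candidate genes survive, but each is supported by more independent predictions.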