Detecting phenotype-driven transitions in regulatory network structure.
ABSTRACT: Complex traits and diseases like human height or cancer are often not caused by a single mutation or genetic variant, but instead arise from functional changes in the underlying molecular network. Biological networks are known to be highly modular and contain dense "communities" of genes that carry out cellular processes, but these structures change between tissues, during development, and in disease. While many methods exist for inferring networks and analyzing their topologies separately, there is a lack of robust methods for quantifying differences in network structure. Here, we describe ALPACA (ALtered Partitions Across Community Architectures), a method for comparing two genome-scale networks derived from different phenotypic states to identify condition-specific modules. In simulations, ALPACA leads to more nuanced, sensitive, and robust module discovery than currently available network comparison methods. As an application, we use ALPACA to compare transcriptional networks in three contexts: angiogenic and non-angiogenic subtypes of ovarian cancer, human fibroblasts expressing transforming viral oncogenes, and sexual dimorphism in human breast tissue. In each case, ALPACA identifies modules enriched for processes relevant to the phenotype. For example, modules specific to angiogenic ovarian tumors are enriched for genes associated with blood vessel development, and modules found in female breast tissue are enriched for genes involved in estrogen receptor and ERK signaling. The functional relevance of these new modules suggests that not only can ALPACA identify structural changes in complex networks, but also that these changes may be relevant for characterizing biological phenotypes.
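The idea of dense "communities" in a network is usually made quantitative with a modularity score. The sketch below computes the generic Newman modularity of a given partition for an undirected weighted graph. It is an illustration of what "modular" means, not ALPACA's own scoring function, which the abstract does not specify; the function name and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected weighted graph.

    edges: iterable of (u, v, weight), each undirected edge listed once.
    community: dict mapping node -> community label.
    """
    total = 0.0                    # m: total edge weight
    strength = defaultdict(float)  # k_i: summed edge weight at each node
    within = defaultdict(float)    # edge weight inside each community
    for u, v, w in edges:
        total += w
        strength[u] += w
        strength[v] += w
        if community[u] == community[v]:
            within[community[u]] += w
    if total == 0:
        return 0.0
    # Summed node strength per community.
    comm_strength = defaultdict(float)
    for node, k in strength.items():
        comm_strength[community[node]] += k
    # Q = sum over communities of (within / m) - (strength / 2m)^2
    return sum(
        within[c] / total - (comm_strength[c] / (2 * total)) ** 2
        for c in comm_strength
    )


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
toy_edges = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
    ("c", "d", 1.0),
]
two_modules = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_module = {node: 0 for node in two_modules}
q_split = modularity(toy_edges, two_modules)
q_merged = modularity(toy_edges, one_module)
```

Splitting the toy graph into its two triangles scores higher than lumping all nodes together, which is the behaviour a community detection method exploits when it maximizes such a score.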
Project description:BACKGROUND:Breast cancer and ovarian cancer are hormone driven and are known to share some predisposition genes, such as the two well-known cancer genes BRCA1 and BRCA2. The objective of this study is to compare the coexpression network modules of both cancers in order to infer potential cancer-related modules. METHODS:We applied eigen-decomposition to the matrix that integrates the gene coexpression networks of both breast cancer and ovarian cancer. With hierarchical clustering of the related eigenvectors, we obtained the network modules of both cancers simultaneously. Enrichment analysis on Gene Ontology (GO), KEGG pathway, Disease Ontology (DO), and Gene Set Enrichment Analysis (GSEA) was performed in the identified modules. RESULTS:We identified 43 modules that are enriched in at least one of the four types of enrichment analysis. 31, 25, and 18 modules are enriched for GO terms, KEGG pathways, and DO terms, respectively. The structure of 29 modules differs significantly between the two cancers (p-values less than 0.05), of which 25 modules have larger densities in ovarian cancer. One module was found to be significantly enriched for terms related to breast cancer in the GO, KEGG, and DO enrichment analyses. One module was found to be significantly enriched for ovarian cancer-related terms. CONCLUSION:Breast cancer and ovarian cancer share some common properties at the module level. Integration of both cancers helps identify potential cancer-associated modules.
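Comparing the density of one module across two networks can be sketched as below. The definition used here (mean edge weight over all gene pairs in the module, with missing pairs counted as zero) is an assumption, since the abstract does not define density or name the significance test it applied. The gene labels and weights are made up for illustration.

```python
from itertools import combinations


def module_density(module, weights):
    """Mean edge weight over all gene pairs in a module.

    module: iterable of gene identifiers.
    weights: dict mapping frozenset({gene_a, gene_b}) -> edge weight.
    Pairs absent from `weights` contribute 0 (assumed definition).
    """
    genes = sorted(set(module))
    pairs = list(combinations(genes, 2))
    if not pairs:
        return 0.0
    return sum(weights.get(frozenset(p), 0.0) for p in pairs) / len(pairs)


# Hypothetical coexpression weights for the same three genes in two networks.
network_one = {
    frozenset({"g1", "g2"}): 0.2,
    frozenset({"g2", "g3"}): 0.1,
}
network_two = {
    frozenset({"g1", "g2"}): 0.9,
    frozenset({"g2", "g3"}): 0.8,
    frozenset({"g1", "g3"}): 0.7,
}
module = ["g1", "g2", "g3"]
density_one = module_density(module, network_one)
density_two = module_density(module, network_two)
```

A significance call on the difference between the two densities would additionally need a null model, which this sketch leaves out.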
Project description:BACKGROUND:We recently identified two robust ovarian cancer subtypes, defined by the expression of genes involved in angiogenesis, with significant differences in clinical outcome. To identify potential regulatory mechanisms that distinguish the subtypes we applied PANDA, a method that uses an integrative approach to model information flow in gene regulatory networks. RESULTS:We find distinct differences between networks that are active in the angiogenic and non-angiogenic subtypes, largely defined by a set of key transcription factors that, although previously reported to play a role in angiogenesis, are not strongly differentially-expressed between the subtypes. Our network analysis indicates that these factors are involved in the activation (or repression) of different genes in the two subtypes, resulting in differential expression of their network targets. Mechanisms mediating differences between subtypes include a previously unrecognized pro-angiogenic role for increased genome-wide DNA methylation and complex patterns of combinatorial regulation. CONCLUSIONS:The models we develop require a shift in our interpretation of the driving factors in biological networks away from the genes themselves and toward their interactions. The observed regulatory changes between subtypes suggest therapeutic interventions that may help in the treatment of ovarian cancer.
Project description:The use of biological networks such as protein-protein interaction and transcriptional regulatory networks is becoming an integral part of genomics research. However, these networks are not static, and during phenotypic transitions like disease onset, they can acquire new "communities" (or highly interacting groups) of genes that carry out cellular processes. Disease communities can be detected by maximizing a modularity-based score, but since biological systems and network inference algorithms are inherently noisy, it remains a challenge to determine whether these changes represent real cellular responses or whether they appeared by random chance. Here, we introduce Constrained Random Alteration of Network Edges (CRANE), a method for randomizing networks with fixed node strengths. CRANE can be used to generate a null distribution of gene regulatory networks that can in turn be used to rank the most significant changes in candidate disease communities. Compared to other approaches, such as consensus clustering or commonly used generative models, CRANE emulates biologically realistic networks and recovers simulated disease modules with higher accuracy. When applied to breast and ovarian cancer networks, CRANE improves the identification of cancer-relevant GO terms while reducing the signal from non-specific housekeeping processes.
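Randomizing a network while keeping node strengths fixed can be illustrated with a simple 2x2 "checkerboard" move on a weighted bipartite matrix (for example, regulators by target genes). This is a minimal sketch under that assumption and is not CRANE's actual procedure, which the abstract does not detail; the function name, parameters, and toy matrix are hypothetical.

```python
import random


def perturb_fixed_strengths(weights, n_steps, max_delta, seed=0):
    """Randomly perturb a weighted bipartite matrix, keeping row and column sums.

    weights: list of rows (e.g. regulators) over columns (e.g. target genes);
             needs at least two rows and two columns.
    Each step picks two rows (i, j) and two columns (a, b), then adds d to
    entries (i, a) and (j, b) and subtracts d from (i, b) and (j, a).
    Every row sum and column sum (node strength) is therefore unchanged.
    Note: individual weights may become negative in this simple version.
    """
    rng = random.Random(seed)
    out = [row[:] for row in weights]
    n_rows, n_cols = len(out), len(out[0])
    for _ in range(n_steps):
        i, j = rng.sample(range(n_rows), 2)
        a, b = rng.sample(range(n_cols), 2)
        d = rng.uniform(-max_delta, max_delta)
        out[i][a] += d
        out[j][b] += d
        out[i][b] -= d
        out[j][a] -= d
    return out


# Hypothetical 3 x 4 edge-weight matrix.
original = [
    [1.0, 0.5, 0.0, 2.0],
    [0.3, 1.2, 0.7, 0.1],
    [0.9, 0.0, 1.5, 0.4],
]
randomized = perturb_fixed_strengths(original, n_steps=50, max_delta=0.5)
```

Calling the function repeatedly with different seeds would give an ensemble of networks with identical node strengths, which could serve as a null distribution for ranking candidate communities.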
Project description:Inflammation has been recognized as an important driver in the development and growth of malignancies. Inflammatory signaling in cancer emerges from the combinatorial interaction of several deregulated pathways. Pathway deregulation is often driven by changes in the underlying gene regulatory networks. Confronted with such a complex scenario, it can be argued that a closer analysis of the structure of such regulatory networks will shed some light on how gene deregulation leads to sustained inflammation in cancer. Here, we inferred an inflammation-associated gene regulatory network from 641 breast cancer and 78 healthy samples. A modular structure analysis of the regulatory network was carried out, revealing a hierarchical modular structure. Modules show significant overrepresentation p-values for biological processes, unveiling a definite association between inflammatory processes and adaptive immunity. Other modules are enriched for T-cell activation, differentiation of CD8+ lymphocytes, and immune cell migration, thus reinforcing the aforementioned association. These analyses suggest that in breast cancer tumors, the balance between antitumor response and immune tolerance involving CD8+ T cells is tipped in favor of the tumor. One possible mechanism is the induction of tolerance and anergization of these cells by persistent antigen exposure.
Project description:Breast cancer is one of the leading causes of mortality in females. A number of prognostic markers have been identified, including single genes, multi-gene signatures and network modules; however, the robustness of these prognostic markers is insufficient. Thus, the present study proposed a more robust method to identify breast cancer prognostic modules based on weighted protein-protein interaction networks, by integrating four sets of disease-associated expression profiles. Three identified prognostic modules were closely associated with prognosis-associated functions and survival time, as determined by Cox regression and Kaplan-Meier survival analyses. The robustness of these modules was verified with an independent profile from another platform. Genes from these modules may be useful as breast cancer prognostic markers. The prognostic modules could be used to determine the prognoses of patients with breast cancer and characterize patient recovery.
Project description:Gene and protein expression changes observed with tumorigenesis are often interpreted independently of each other and out of context of biological networks. To address these limitations, this study examined several approaches to integrate transcriptomic and proteomic data with known protein-protein and signaling interactions in estrogen receptor positive (ER+) breast cancer tumors. An approach that built networks from differentially expressed proteins and identified among them networks enriched in differentially expressed genes yielded the greatest success. This method identified a set of genes and proteins linking pathways of cellular stress response, cancer metabolism, and tumor microenvironment. The proposed network underscores several biologically intriguing events not previously studied in the context of ER+ breast cancer, including the overexpression of p38 mitogen-activated protein kinase and the overexpression of poly(ADP-ribose) polymerase 1. A gene-based expression signature biomarker built from this network was significantly predictive of clinical relapse in multiple independent cohorts of ER+ breast cancer patients, even after correcting for standard clinicopathological variables. The results of this study demonstrate the utility and power of an integrated quantitative proteomic, transcriptomic, and network analysis approach to discover robust and clinically meaningful molecular changes in tumors.
Project description:Many human diseases, including cancer, are the result of perturbations to transcriptional regulatory networks that control context-specific expression of genes. Comparison across multiple cancer types is a powerful way to illuminate the common and specific network features of this family of diseases. Recent efforts from The Cancer Genome Atlas (TCGA) have generated large collections of functional genomic data sets for multiple types of cancers. An emerging challenge is to devise computational approaches that systematically compare these genomic data sets across different cancer types to identify common and cancer-specific network components. We present a module- and network-based characterization of transcriptional patterns in six different cancers being studied in TCGA: breast, colon, rectal, kidney, ovarian, and endometrial. Our approach uses a recently developed regulatory network reconstruction algorithm, modular regulatory network learning with per gene information (MERLIN), within a stability selection framework to predict regulators for individual genes and gene modules. Our module-based analysis identifies a common theme of immune system processes in each cancer study, with modules statistically enriched for immune response processes as well as targets of key immune response regulators from the interferon regulatory factor (IRF) and signal transducer and activator of transcription (STAT) families. Comparison of the inferred regulatory networks from each cancer type identified a core regulatory network that included genes involved in chromatin remodeling, cell cycle, and immune response. Regulatory network hubs included genes with known roles in specific cancer types as well as genes with potentially novel roles in different cancer types.
Overall, our integrated module and network analysis recapitulated known themes in cancer biology and additionally revealed novel regulatory hubs that suggest a complex interplay of immune response, cell cycle, and chromatin remodeling across multiple cancers.
Project description:Breast cancer is a heterogeneous and complex disease; a clear manifestation of this is its classification into different molecular subtypes. On the other hand, gene transcriptional networks may exhibit different modular structures that can be related to known biological processes. Thus, modular structures in transcriptional networks may be seen as manifestations of regulatory structures that tightly control biological processes. In this work, we identify modular structures in gene transcriptional networks previously inferred from microarray data of molecular subtypes of breast cancer: luminal A, luminal B, basal, and HER2-enriched. We analyzed the modules (communities) found in each network to identify particular biological functions (described in the Gene Ontology database) associated with them. We further explored these modules and their associated functions to identify common and unique features that could allow a better level of description of breast cancer, particularly in the basal-like subtype, the most aggressive, poor-prognosis manifestation. Our findings related to the immune system and a decrease in cell death-related processes in the basal subtype could help to understand it and to design strategies for its treatment.
Project description:Identifying functional modules or novel active pathways, recently termed de novo pathway enrichment, is a computational systems biology challenge that has gained much attention during the last decade. Given a large biological interaction network, KeyPathwayMiner extracts connected subnetworks that are enriched for differentially active entities from a series of molecular profiles encoded as binary indicator matrices. Since interaction networks constantly evolve, an important question is how robust the extracted results are when the network is modified. We enable users to study this effect through several network perturbation techniques and over a range of perturbation degrees. In addition, users may now provide a gold-standard set to determine how enriched extracted pathways are with relevant genes compared to randomized versions of the original network.
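Comparing an extracted pathway against randomized versions of the network amounts to an empirical p-value. The sketch below is a generic illustration, not KeyPathwayMiner's implementation: it scores a pathway by its overlap with a gold-standard gene set and ranks that overlap against the scores obtained on randomized networks. All names and numbers are hypothetical.

```python
def gold_overlap(pathway_genes, gold_standard):
    """Number of extracted pathway genes that are in the gold-standard set."""
    return len(set(pathway_genes) & set(gold_standard))


def empirical_p_value(observed, null_values):
    """Fraction of null scores at least as large as the observed score.

    The +1 in numerator and denominator counts the observed value itself,
    so the estimate is never exactly zero.
    """
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Hypothetical example: overlap of one extracted pathway with a gold standard,
# and the overlaps obtained from four randomized networks.
observed_overlap = gold_overlap(["g1", "g2", "g3", "g4"], ["g2", "g3", "g4", "g9"])
null_overlaps = [0, 1, 1, 3]
p_value = empirical_p_value(observed_overlap, null_overlaps)
```

In practice the null list would hold many more randomizations; with only four, the smallest attainable p-value is 0.2.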
Project description:BACKGROUND:Since genes involved in the same biological modules usually present correlated expression profiles, many computational methods have been proposed to identify gene functional modules based on expression profile data. Recently, the Sparse Singular Value Decomposition (SSVD) method was proposed to bicluster gene expression data to identify gene modules. However, this model can only handle gene expression data in which no gene interaction information is integrated. Ignoring prior gene interaction information may make the identified gene modules hard to interpret biologically. RESULTS:In this paper, we develop a Sparse Network-regularized SVD (SNSVD) method that integrates a prior gene interaction network from a protein-protein interaction network and gene expression data to identify underlying gene functional modules. The results on a set of simulated data show that SNSVD is more effective than the traditional SVD-based methods. Further experimental results on real cancer genomic data show that most co-expressed modules are not only significantly enriched for GO/KEGG pathways, but also correspond to dense sub-networks in the prior gene interaction network. In addition, we use our method to identify ten differentially co-expressed miRNA-gene modules by integrating matched miRNA and mRNA expression data of breast cancer from The Cancer Genome Atlas (TCGA). Several important breast cancer-related miRNA-gene modules are discovered. CONCLUSIONS:All the results demonstrate that SNSVD can overcome the drawbacks of SSVD and capture more biologically relevant functional modules by incorporating a prior gene interaction network. These identified functional modules may provide a new perspective for understanding the diagnostics, occurrence, and progression of cancer.