DiNeR: a Differential graphical model for analysis of co-regulation Network Rewiring.
ABSTRACT: BACKGROUND: During transcription, numerous transcription factors (TFs) bind to targets in a highly coordinated manner to control gene expression. Alterations in groups of TF-binding profiles (i.e. "co-binding changes") can affect the co-regulating associations between TFs (i.e. "rewiring the co-regulator network"). This, in turn, can potentially drive downstream expression changes, phenotypic variation, and even disease. However, quantification of co-regulatory network rewiring has not been comprehensively studied. RESULTS: To address this, we propose DiNeR, a computational method to directly construct a differential TF co-regulation network from paired disease-to-normal ChIP-seq data. Specifically, DiNeR uses a graphical model to capture the gained and lost edges in the co-regulation network. Then, it adopts a stability-based, sparsity-tuning criterion, sub-sampling the complete binding profiles to remove spurious edges, so that only significant co-regulation alterations are reported. Finally, DiNeR highlights hubs in the resultant differential network as key TFs associated with disease. We assembled genome-wide binding profiles of 104 TFs in the K562 and GM12878 cell lines, which loosely model the transition between normal and cancerous states in chronic myeloid leukemia (CML). In total, we identified 351 significantly altered TF co-regulation pairs. In particular, we found that the co-binding of the tumor suppressor BRCA1 and RNA polymerase II, a well-known transcriptional pair in healthy cells, was disrupted in tumors. Thus, DiNeR successfully extracted hub regulators and discovered well-known risk genes. CONCLUSIONS: Our method DiNeR makes it possible to quantify changes in co-regulatory networks and identify alterations to TF co-binding patterns, highlighting key disease regulators.
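As a rough illustration of the stability-based idea in the abstract above, the sketch below builds a differential co-binding network from two binary binding matrices (rows are genomic regions, columns are TFs). It is not the DiNeR graphical model: it substitutes a plain difference of TF-TF correlation matrices, and the function names, the change threshold, the sub-sampling fraction, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def differential_edges(normal, disease, threshold=0.3):
    """True where the TF-TF co-binding correlation changes by more than threshold."""
    delta = np.corrcoef(disease, rowvar=False) - np.corrcoef(normal, rowvar=False)
    edges = np.abs(delta) > threshold
    np.fill_diagonal(edges, False)
    return edges

def stable_differential_network(normal, disease, n_rounds=50, frac=0.5,
                                min_freq=0.8, threshold=0.3, seed=0):
    """Keep only edges that reappear in at least min_freq of the sub-samples."""
    rng = np.random.default_rng(seed)
    n_tfs = normal.shape[1]
    counts = np.zeros((n_tfs, n_tfs))
    for _ in range(n_rounds):
        # Sub-sample regions independently in each condition (assumed scheme).
        idx_n = rng.choice(normal.shape[0], int(frac * normal.shape[0]), replace=False)
        idx_d = rng.choice(disease.shape[0], int(frac * disease.shape[0]), replace=False)
        counts += differential_edges(normal[idx_n], disease[idx_d], threshold)
    return counts / n_rounds >= min_freq

def hub_degrees(edges):
    """Number of altered co-binding partners per TF."""
    return edges.sum(axis=0)

# Toy data (hypothetical): TF0 and TF1 co-bind in "normal" but not in "disease".
rng = np.random.default_rng(1)
normal = rng.integers(0, 2, size=(2000, 4))
normal[:, 1] = normal[:, 0]
disease = rng.integers(0, 2, size=(2000, 4))

network = stable_differential_network(normal, disease)
degrees = hub_degrees(network)
```

In this toy case only the TF0-TF1 pair survives the stability filter; TFs with the largest entries in `degrees` would be the candidate hubs.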
Project description:When averaged over the full yeast protein-protein interaction and transcriptional regulatory networks, protein hubs with many interaction partners or regulators tend to evolve significantly more slowly due to increased negative selection. However, genome-wide analysis of protein evolution in the subnetworks of associations involving yeast transcription factors (TFs) reveals that TF hubs do not tend to evolve significantly more slowly than TF non-hubs. This result holds for all four major types of TF hubs: interaction hubs, regulatory in-degree and out-degree hubs, as well as co-regulatory hubs that jointly regulate target genes with many TFs. Furthermore, TF regulatory in-degree hubs tend to evolve significantly more quickly than TF non-hubs. Most importantly, the correlations between evolutionary rate (K(A)/K(S)) and degrees for TFs are significantly more positive than those for generic proteins within the same global protein-protein interaction and transcriptional regulatory networks. Compared to generic protein hubs, TF hubs operate at a higher level in the hierarchical structure of cellular networks, and hence experience additional evolutionary forces (relaxed negative selection or positive selection through network rewiring). The striking difference between the evolution of TF hubs and generic protein hubs demonstrates that components within the same global network can be governed by distinct organizational and evolutionary principles.
Project description:This study exploits time, the relatively unexplored fourth dimension of gene regulatory networks (GRNs), to learn the temporal transcriptional logic underlying dynamic nitrogen (N) signaling in plants. Our "just-in-time" analysis of time-series transcriptome data uncovered a temporal cascade of cis elements underlying dynamic N signaling. To infer transcription factor (TF)-target edges in a GRN, we applied a time-based machine learning method to 2,174 dynamic N-responsive genes. We experimentally determined a network precision cutoff, using TF-regulated genome-wide targets of three TF hubs (CRF4, SNZ, and CDF1), used to "prune" the network to 155 TFs and 608 targets. This network precision was reconfirmed using genome-wide TF-target regulation data for four additional TFs (TGA1, HHO5/6, and PHL1) not used in network pruning. These higher-confidence edges in the GRN were further filtered by independent TF-target binding data, used to calculate a TF "N-specificity" index. This refined GRN identifies the temporal relationship of known/validated regulators of N signaling (NLP7/8, TGA1/4, NAC4, HRS1, and LBD37/38/39) and 146 additional regulators. Six TFs (CRF4, SNZ, CDF1, HHO5/6, and PHL1) validated herein regulate a significant number of genes in the dynamic N response, targeting 54% of N-uptake/assimilation pathway genes. Phenotypically, inducible overexpression of CRF4 in planta regulates genes resulting in altered biomass, root development, and 15NO3- uptake, specifically under low-N conditions. This dynamic N-signaling GRN now provides the temporal "transcriptional logic" for 155 candidate TFs to improve nitrogen use efficiency with potential agricultural applications. Broadly, these time-based approaches can uncover the temporal transcriptional logic for any biological response system in biology, agriculture, or medicine.
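The pruning step described above can be pictured as choosing an edge-score cutoff at which predicted edges, checked against experimentally validated TF targets, reach a target precision. The sketch below is a minimal illustration under that assumption; the scores, TF and gene labels, and the 0.75 target are invented for the example and are not taken from the study.

```python
def precision_at_cutoff(scored_edges, validated, cutoff):
    """Fraction of edges with score >= cutoff that appear in the validated set."""
    kept = [edge for edge, score in scored_edges.items() if score >= cutoff]
    if not kept:
        return None
    return sum(edge in validated for edge in kept) / len(kept)

def lowest_cutoff_for_precision(scored_edges, validated, target):
    """Smallest observed score whose cutoff meets the target precision, else None."""
    for cutoff in sorted(set(scored_edges.values())):
        precision = precision_at_cutoff(scored_edges, validated, cutoff)
        if precision is not None and precision >= target:
            return cutoff
    return None

# Hypothetical predicted edges (TF, target) with confidence scores.
scored_edges = {
    ("TF1", "geneA"): 0.9,
    ("TF1", "geneB"): 0.7,
    ("TF2", "geneC"): 0.6,
    ("TF2", "geneD"): 0.4,
    ("TF3", "geneE"): 0.2,
}
# Hypothetical experimentally validated edges for the same TFs.
validated = {("TF1", "geneA"), ("TF1", "geneB"), ("TF2", "geneD")}

cutoff = lowest_cutoff_for_precision(scored_edges, validated, target=0.75)
pruned = {edge: s for edge, s in scored_edges.items() if s >= cutoff}
```

Here the cutoff lands at 0.4, where three of the four retained edges are validated; the lowest-scoring edge is pruned.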
Project description:Network Component Analysis (NCA) has been used to deduce the activities of transcription factors (TFs) from gene expression data and the TF-gene binding relationship. However, the TF-gene interaction varies in different environmental conditions and tissues, but such information is rarely available and cannot be predicted simply by motif analysis. Thus, it is beneficial to identify key TF-gene interactions under the experimental condition based on transcriptome data. Such information would be useful in identifying key regulatory pathways and gene markers of TFs in further studies. We developed an algorithm to trim network connectivity such that the important regulatory interactions between the TFs and the genes were retained and the regulatory signals were deduced. Theoretical studies demonstrated that the regulatory signals were accurately reconstructed even in the case where only three independent transcriptome datasets were available. At least 80% of the main target genes were correctly predicted in the extreme condition of high noise level and small number of datasets. Our algorithm was tested with transcriptome data taken from mice under rapamycin treatment. The initial network topology from the literature contains 70 TFs, 778 genes, and 1423 edges between the TFs and genes. Our method retained 1074 edges (i.e. 75% of the original edge number) and identified 17 TFs as being significantly perturbed under the experimental condition. Twelve of these TFs are involved in MAPK signaling or myeloid leukemia pathways defined in the KEGG database, or are known to physically interact with each other. Additionally, four of these TFs, which are Hif1a, Cebpb, Nfkb1, and Atf1, are known targets of rapamycin.
Furthermore, the trimmed network was able to predict Eno1 as an important target of Hif1a; this key interaction could not be detected without trimming the regulatory network. The advantage of our new algorithm, relative to the original NCA, is that our algorithm can identify the important TF-gene interactions. Identifying the important TF-gene interactions is crucial for understanding the roles of pleiotropic global regulators, such as p53. Also, our algorithm has been developed to overcome NCA's inability to analyze large networks where multiple TFs regulate a single gene. Thus, our algorithm extends the applicability of NCA to the realm of mammalian regulatory network analysis.
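For orientation, NCA models expression as E ≈ A·P, where A (genes × TFs) holds connection strengths whose zero pattern is fixed by the known connectivity and P (TFs × samples) holds TF activities. The sketch below fits this by alternating least squares and then trims the weakest edges. It is a simplified stand-in, not the authors' trimming algorithm: the quantile-based rule, the `keep_fraction` value, the function names, and the synthetic data are assumptions.

```python
import numpy as np

def fit_masked_factorization(E, mask, n_iter=30, seed=0):
    """Alternating least squares for E ~ A @ P with A zero wherever mask is False."""
    rng = np.random.default_rng(seed)
    n_genes, n_tfs = mask.shape
    A = rng.normal(size=(n_genes, n_tfs)) * mask
    errors = []
    for _ in range(n_iter):
        # Update TF activities with the connection strengths held fixed.
        P = np.linalg.lstsq(A, E, rcond=None)[0]
        # Update each gene's strengths using only the TFs it is connected to.
        for g in range(n_genes):
            tfs = np.flatnonzero(mask[g])
            if tfs.size:
                A[g, tfs] = np.linalg.lstsq(P[tfs].T, E[g], rcond=None)[0]
        errors.append(float(np.linalg.norm(E - A @ P)))
    # Fix the scale ambiguity: give each activity profile unit norm.
    scale = np.linalg.norm(P, axis=1)
    scale[scale == 0] = 1.0
    return A * scale, P / scale[:, None], errors

def trim_edges(A, mask, keep_fraction=0.75):
    """Drop the weakest edges by |strength|; returns the trimmed boolean mask."""
    cutoff = np.quantile(np.abs(A[mask]), 1.0 - keep_fraction)
    return mask & (np.abs(A) >= cutoff)

# Synthetic example (hypothetical sizes): 20 genes, 3 TFs, 6 samples.
rng = np.random.default_rng(1)
mask = rng.random((20, 3)) < 0.6
A_true = rng.normal(size=(20, 3)) * mask
P_true = rng.normal(size=(3, 6))
E = A_true @ P_true + 0.01 * rng.normal(size=(20, 6))

A, P, errors = fit_masked_factorization(E, mask)
trimmed = trim_edges(A, mask)
```

Because each update is an exact least-squares solve over one block of variables, the recorded reconstruction error cannot increase from one iteration to the next; whether the fit recovers the true strengths depends on the identifiability conditions of NCA, which this sketch does not check.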
Project description:BACKGROUND:Inferring the mechanisms that drive transcriptional regulation is of great interest to biologists. Generally, methods that predict physical interactions between transcription factors (TFs) based on positional information of their binding sites (e.g. chromatin immunoprecipitation followed by sequencing (ChIP-Seq) experiments) cannot distinguish between different kinds of interaction at the same binding spots, such as co-operation and competition. RESULTS:In this work, we present the Network-Augmented Transcriptional Interaction and Coregulation Analyser (NAUTICA), which employs information from protein-protein interaction (PPI) networks to assign TF-TF interaction candidates to one of three classes: competition, co-operation and non-interactions. NAUTICA filters available PPI network edges and fits a prediction model based on the number of shared partners in the PPI network between two candidate interactors. CONCLUSIONS:NAUTICA improves on existing positional information-based TF-TF interaction prediction results, demonstrating how PPI information can improve the quality of TF interaction prediction. NAUTICA predictions - both co-operations and competitions - are supported by literature investigation, providing evidence on its capability of providing novel interactions of both kinds. REVIEWERS:This article was reviewed by Zoltán Hegedüs and Endre Barta.
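The feature that NAUTICA's prediction model is fitted on, the number of PPI partners shared by two candidate TFs, can be computed as in the sketch below. Only the counting step is shown; the abstract does not give the edge-filtering rules or the thresholds that separate the three classes, so none are implied here. The function name and the toy edge list are hypothetical.

```python
def shared_partners(ppi_edges, tf_a, tf_b):
    """Proteins that interact with both tf_a and tf_b in an undirected PPI edge list."""
    neighbours = {}
    for u, v in ppi_edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    # Exclude the two candidates themselves from each other's partner sets.
    partners_a = neighbours.get(tf_a, set()) - {tf_b}
    partners_b = neighbours.get(tf_b, set()) - {tf_a}
    return partners_a & partners_b

# Toy PPI network with made-up protein labels.
ppi_edges = [
    ("TF_A", "P1"), ("TF_A", "P2"), ("TF_A", "P3"),
    ("TF_B", "P2"), ("TF_B", "P3"), ("TF_B", "P4"),
    ("TF_A", "TF_B"),
]
common = shared_partners(ppi_edges, "TF_A", "TF_B")
n_shared = len(common)
```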
Project description:Integrated analysis of multiple genome-wide transcription factor (TF)-binding profiles will be vital to advance our understanding of the global impact of TF binding. However, existing methods for measuring similarity in large numbers of chromatin immunoprecipitation assays with sequencing (ChIP-seq), such as correlation, mutual information or enrichment analysis, are limited in their ability to display functionally relevant TF relationships. In this study, we propose the use of graphical models to determine conditional independence between TFs and showed that network visualization provides a promising alternative to distinguish 'direct' versus 'indirect' TF interactions. We applied four algorithms to measure 'direct' dependence to a compendium of 367 mouse haematopoietic TF ChIP-seq samples and obtained a consensus network known as a 'TF association network' where edges in the network corresponded to likely causal pairwise relationships between TFs. The 'TF association network' illustrates the role of TFs in developmental pathways, is reminiscent of combinatorial TF regulation, corresponds to known protein-protein interactions and indicates substantial TF-binding reorganization in leukemic cell types. With the rapid increase in TF ChIP-Seq data sets, the approach presented here will be a powerful tool to study transcriptional programmes across a wide range of biological systems.
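The distinction between 'direct' and 'indirect' relationships can be illustrated with Gaussian partial correlations, which are obtained from the inverse covariance (precision) matrix. This is one standard way to express conditional independence and is not necessarily one of the four algorithms used in the study; the simulated chain X → Y → Z and the function name are assumptions for the example.

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation of each variable pair given all other variables."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

# Simulated chain: X drives Y, Y drives Z; X and Z are linked only through Y.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
y = x + 0.5 * rng.normal(size=n)
z = y + 0.5 * rng.normal(size=n)
data = np.column_stack([x, y, z])

marginal = np.corrcoef(data, rowvar=False)
partial = partial_correlations(data)
```

In this simulation X and Z are strongly correlated marginally, but their partial correlation given Y is close to zero, so a network built on partial correlations would keep the X-Y and Y-Z edges and drop the indirect X-Z edge.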
Project description:The suitability of wood biomass for pulp production depends on the cellular architecture and composition of the secondary cell wall. Systems genetics approaches are currently being employed to understand the molecular basis of trait variation, and co-expression network analysis has enabled a holistic understanding of complex traits such as secondary development. Transcription factors (TFs) are reported to be key regulators of meristematic growth and wood formation. The hierarchical TF network is a multi-layered system that interacts with downstream structural genes involved in the biosynthesis of cellulose, hemicelluloses and lignin. Several TFs have been associated with wood formation in tree species such as Populus, Eucalyptus, Picea and Pinus. However, TF-specific co-expression networks for understanding the interactions between these regulators have not been reported. In the present study, a co-expression network was developed for TFs expressed during wood formation in Eucalyptus tereticornis, and the ethylene-responsive element-binding factor EtERF2 was identified as the major hub transcript, co-expressed with other secondary cell wall biogenesis-specific TFs such as EtSND2, EtVND1, EtVND4, EtVND6, EtMYB70, EtGRAS and EtSCL8. This study reveals a probable role of ethylene in determining natural variation in wood properties in Eucalyptus species. Understanding the transcriptional regulation underpinning this complex bio-processing trait of wood biomass will complement the Eucalyptus breeding program through marker-assisted selection of industrially suitable phenotypes.
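A hub transcript in a co-expression network is typically the node with the most strongly correlated partners. The sketch below shows one simple way to find it, assuming Pearson correlation with a hard threshold; the abstract does not state which network construction method was used, and the transcript labels, expression values, and threshold are invented for the example.

```python
import numpy as np

def coexpression_hub(expr, names, min_abs_r=0.6):
    """Return the highest-degree transcript and every transcript's degree.

    expr has one row per transcript and one column per sample.
    """
    r = np.corrcoef(expr)
    adjacency = np.abs(r) >= min_abs_r
    np.fill_diagonal(adjacency, False)
    degree = adjacency.sum(axis=1)
    return names[int(np.argmax(degree))], dict(zip(names, degree.tolist()))

# Toy expression profiles across four samples (hypothetical transcripts).
names = ["tf_hub", "tf_a", "tf_b", "tf_c"]
expr = np.array([
    [1.0, -1.0, 1.0, -1.0],   # tf_hub tracks both tf_a and tf_b
    [1.0, -1.0, 0.0, 0.0],    # tf_a
    [0.0, 0.0, 1.0, -1.0],    # tf_b, uncorrelated with tf_a
    [1.0, 1.0, -1.0, -1.0],   # tf_c, uncorrelated with the rest
])
hub, degrees = coexpression_hub(expr, names)
```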
Project description:Osteosarcomas (OS) are complex bone tumors with various genomic alterations. These alterations affect the expression and function of several genes due to drastic changes in the underlying gene regulatory network. However, we know little about critical gene regulators and their functional consequences on the pathogenesis of OS. Therefore, we aimed to determine microRNA and transcription factor (TF) co-regulatory networks in OS cell proliferation. Cell proliferation is an essential part in the pathogenesis of OS and deeper understanding of its regulation might help to identify potential therapeutic targets. Based on expression data of OS cell lines divided according to their proliferative activity, we obtained 12 proliferation-related microRNAs and corresponding target genes. Therewith, microRNA and TF co-regulatory networks were generated and analyzed regarding their structure and functional influence. We identified key co-regulators comprising the microRNAs miR-9-5p, miR-138, and miR-214 and the TFs SP1 and MYC in the derived networks. These regulators are implicated in NFKB- and RB1-signaling and focal adhesion processes based on their common or interacting target genes (e.g., CDK6, CTNNB1, E2F4, HES1, ITGA6, NFKB1, NOTCH1, and SIN3A). Thus, we proposed a model of OS cell proliferation which is primarily co-regulated through the interactions of the mentioned microRNA and TF combinations. This study illustrates the benefit of systems biological approaches in the analysis of complex diseases. We integrated experimental data with publicly available information to unravel the coordinated (post)-transcriptional control of microRNAs and TFs to identify potential therapeutic targets in OS. The resulting microRNA and TF co-regulatory networks are publicly available for further exploration to generate or evaluate own hypotheses of the pathogenesis of OS (http://www.complex-systems.uni-muenster.de/co_networks.html).
Project description:BACKGROUND: Plant secondary metabolites are critical to various biological processes. However, the regulation of these metabolites is complex because of regulatory rewiring or crosstalk. To unveil how regulatory behaviors on secondary metabolism reshape biological processes, we constructed and analyzed a dynamic regulatory network of secondary metabolic pathways in Arabidopsis. RESULTS: The dynamic regulatory network was constructed by integrating co-expressed gene pairs and regulatory interactions. Regulatory interactions were either predicted from conserved transcription factor binding sites (TFBSs) or proved by experiments. We found that integrating the two data types (co-expression and predicted regulatory interactions) increased the number of high-confidence regulatory interactions by over 10% compared with using either data type alone. The dynamic changes in the regulatory network systematically revealed regulatory rewiring that helps explain the mechanism of regulation, for example in terpenoid metabolism: the regulatory crosstalk of RAV1 (AT1G13260) and ATHB1 (AT3G01470) on HMG1 (hydroxymethylglutaryl-CoA reductase, AT1G76490), and the regulation by RAV1 of epoxysqualene biosynthesis and sterol biosynthesis. In addition, we investigated regulatory rewiring with respect to expression, network topology and upstream signaling pathways. Regulatory rewiring was revealed by the variability of gene expression: pathway genes and transcription factors (TFs) were significantly differentially expressed under different conditions (such as terpenoid biosynthetic genes in tissue experiments and E2F/DP family members in genotype experiments). Both network topology and signaling pathways supported regulatory rewiring. For example, we discovered a correlation among the numbers of pathway genes, TFs and network topology: one-gene pathways (such as ?-carotene biosynthesis) were regulated by fewer TFs, and were not critical to the metabolic network because of their low degrees in the topology.
Upstream signaling pathways of 50 TFs were identified to help explain the mechanism underlying the TFs' regulatory rewiring. CONCLUSION: Overall, this dynamic regulatory network substantially improves the understanding of complex regulatory rewiring in secondary metabolism in Arabidopsis.
Project description:Gene duplication results in two identical paralogs that diverge through mutation, leading to loss or gain of interactions with other biomolecules. Here, we comprehensively characterize such network rewiring for C. elegans transcription factors (TFs) within and across four newly delineated molecular networks. Remarkably, we find that even highly similar TFs often have different interaction degrees and partners. In addition, we find that most TF families have a member that is highly connected in multiple networks. Further, different TF families have opposing correlations between network connectivity and phylogenetic age, suggesting that they are subject to different evolutionary pressures. Finally, TFs that have similar partners in one network generally do not in another, indicating a lack of pressure to retain cross-network similarity. Our multiparameter analyses provide unique insights into the evolutionary dynamics that shaped TF networks.
Project description:Gene expression is controlled by sequence-specific transcription factors (TFs), which bind to regulatory sequences in DNA. TF binding occurs in nucleosome-depleted regions of DNA (NDRs), which generally encompass regions with lengths similar to those protected by nucleosomes. However, less is known about where within these regions specific TFs tend to be found. Here, we characterize the positional bias of inferred binding sites for 103 TFs within ~500,000 NDRs across 47 cell types. We find that distinct classes of TFs display different binding preferences: Some tend to have binding sites toward the edges, some toward the center, and some at other positions within the NDR. These patterns are highly consistent across cell types, suggesting that they may reflect TF-specific intrinsic structural or functional characteristics. In particular, TF classes with binding sites at NDR edges are enriched for those known to interact with histones and chromatin remodelers, whereas TFs with central enrichment interact with other TFs and cofactors such as p300. Our results suggest distinct regiospecific binding patterns and functions of TF classes within enhancers.
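Positional bias of this kind can be summarized by rescaling each binding site to a relative position within its NDR (0 at one boundary, 1 at the other) and asking how often sites fall near a boundary. The sketch below is a minimal illustration; the 20% edge width, the function names, and the coordinates are assumptions, not values from the study.

```python
def relative_position(site_mid, ndr_start, ndr_end):
    """Position of a binding-site midpoint within its NDR, scaled to [0, 1]."""
    return (site_mid - ndr_start) / (ndr_end - ndr_start)

def edge_fraction(sites, edge_width=0.2):
    """Fraction of sites lying within edge_width of either NDR boundary.

    sites is a list of (site_mid, ndr_start, ndr_end) tuples.
    """
    positions = [relative_position(*site) for site in sites]
    near_edge = [p <= edge_width or p >= 1 - edge_width for p in positions]
    return sum(near_edge) / len(positions)

# Hypothetical inferred sites for one TF: (site midpoint, NDR start, NDR end).
sites = [
    (105, 100, 200),
    (190, 100, 200),
    (150, 100, 200),
    (1010, 1000, 1100),
]
fraction = edge_fraction(sites)
```

A TF with a `fraction` well above the value expected for uniformly placed sites (0.4 with this edge width) would be a candidate edge binder; one well below would lean toward the center.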