Extended notions of sign consistency to relate experimental data to signaling and regulatory network topologies.
ABSTRACT: A rapidly growing amount of knowledge about signaling and gene regulatory networks is available in databases such as KEGG, Reactome, or RegulonDB. There is an increasing need to relate this knowledge to high-throughput data in order to (in)validate network topologies or to decide which interactions are present or inactive in a given cell type under a particular environmental condition. Interaction graphs provide a suitable representation of cellular networks with information flows, and methods based on sign consistency approaches have been shown to be valuable tools to (i) predict qualitative responses, (ii) test the consistency of network topologies and experimental data, and (iii) apply repair operations to the network model, suggesting missing or wrong interactions. We present a framework to unify different notions of sign consistency and propose a refined method for data discretization that considers uncertainties in experimental profiles. We furthermore introduce a new constraint to filter undesired model behaviors induced by positive feedback loops. Finally, we generalize the way predictions can be made by the sign consistency approach. In particular, we distinguish strong predictions (e.g., increase of a node level) and weak predictions (e.g., node level increases or remains unchanged), enlarging the overall predictive power of the approach. We then demonstrate the applicability of our framework by confronting a large-scale gene regulatory network model of Escherichia coli with high-throughput transcriptomic measurements. Overall, our work enhances the flexibility and power of the sign consistency approach for the prediction of the behavior of signaling and gene regulatory networks and, more generally, for the validation and inference of these networks.
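As a rough illustration of what a sign consistency test does, the sketch below implements only the most basic rule: every observed increase or decrease of a non-input node must be explained by at least one signed influence from an observed predecessor. The function name, the edge encoding, and the toy network are assumptions made for illustration; the framework described in the abstract covers further notions (uncertain measurements, feedback-loop constraints, weak predictions) that are not modeled here.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical encoding: (source, target, sign), sign is +1 (activation) or -1 (inhibition).
Edge = Tuple[str, str, int]


def inconsistent_nodes(
    edges: List[Edge],
    observed: Dict[str, int],
    inputs: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Return nodes whose observed change (+1 or -1) no predecessor explains."""
    # Collect, for each target, the signs of influence it receives:
    # edge sign times the observed change of the source node.
    influences: Dict[str, set] = {}
    for src, tgt, sign in edges:
        if src in observed:
            influences.setdefault(tgt, set()).add(sign * observed[src])

    unexplained = []
    for node, change in observed.items():
        # Input nodes and unchanged nodes (0) are not checked in this basic rule.
        if node in inputs or change == 0:
            continue
        if change not in influences.get(node, set()):
            unexplained.append(node)
    return sorted(unexplained)


# Toy network: A activates B, B inhibits C; A is the perturbed input.
toy_edges = [("A", "B", +1), ("B", "C", -1)]
print(inconsistent_nodes(toy_edges, {"A": +1, "B": +1, "C": +1}, frozenset({"A"})))
```

In the toy network the increase of C cannot be explained, because its only regulator B increased and inhibits it. A repair step in the sense of the abstract would then look for a missing activating influence on C or a wrong sign on the B to C edge.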
Project description:Deciphering the design principles for regulatory networks is fundamental to an understanding of biological systems. We have explored the mapping from the space of network topologies to the space of dynamical phenotypes for small networks. Using exhaustive enumeration of a simple model of three- and four-node networks, we demonstrate that certain dynamical phenotypes can be generated by an atypically broad spectrum of network topologies. Such dynamical outputs are highly designable, much like certain protein structures can be designed by an unusually broad spectrum of sequences. The network topologies that encode a highly designable dynamical phenotype possess two classes of connections: a fully conserved core of dedicated connections that encodes the stable dynamical phenotype and a partially conserved set of variable connections that controls the transient dynamical flow. By comparing the topologies and dynamics of the three- and four-node network ensembles, we observe a large number of instances of the phenomenon of "mutational buffering," whereby addition of a fourth node suppresses phenotypic variation amongst a set of three-node networks.
Project description:BACKGROUND: Recently, the research community has seen an influx of data relating to transcriptional regulatory interactions of Corynebacteria, organisms that are highly relevant to fields of systems biology, biotechnology, and human medicine. Information derived from DNA microarray experiments, computational predictions, and literature has opened the way for the graph-based analysis, visualization, and reconstruction of transcriptional regulatory networks across entire organisms. The reference database for corynebacterial gene regulatory networks CoryneRegNet provides methods for data storage and data exchange in a well-structured manner. Additional information on the model organism Escherichia coli K12 obtained from RegulonDB has been integrated. Generally, gene regulatory networks can be visualized as graphs by drawing directed edges between nodes, where a node represents a gene and an edge corresponds to a typed regulatory interaction. Cytoscape is an open-source software project whose aim is to provide graph-based visualization and analysis for biological networks. Its architecture allows the development and integration of user-made plugins to enhance core functionalities. RESULTS: We introduce two novel plugins for the Cytoscape environment designed to enhance in silico studies of procaryotic transcriptional regulatory networks. Our plugins combine the information from the corynebacterial reference database CoryneRegNet with the graph analysis capabilities of Cytoscape. CoryneRegNetLoader queries the CoryneRegNet database to extract a gene regulatory network represented as a directed graph. Additional information is stored as node/edge attributes within the graph. COMA facilitates consistency checks for gene expression studies given a gene regulatory network. COMA tests whether all gene expression levels correlate properly with the given network topology. CONCLUSION: The plugins facilitate in silico studies of procaryotic transcriptional gene regulation, particularly in Corynebacteria and E. coli, by combining the knowledge from the corynebacterial reference database with the graph analysis capabilities of Cytoscape, which is one of the most widely used tools for biological network analyses.
Project description:Biochemical adaptation is one of the basic functions that are widely implemented in biological systems for a variety of purposes such as signal sensing, stress response and homeostasis. The adaptation time scales span from milliseconds to days, involving different regulatory machineries in different processes. The adaptive networks with enzymatic regulation (ERNs) have been investigated in detail, but it remains unclear if and how other forms of regulation will impact the network topology and other features of the function. Here, we systematically studied three-node transcriptional regulatory networks (TRNs), with three different types of gene regulation logics. We found that the topologies of adaptive gene regulatory networks can still be grouped into two general classes: negative feedback loop (NFBL) and incoherent feed-forward loop (IFFL), but with some distinct topological features compared to the enzymatic networks. Specifically, an auto-activation loop on the buffer node is necessary for the NFBL class. For the IFFL class, the control node can be either a proportional node or an inversely-proportional node. Furthermore, the tunability of adaptive behavior differs between TRNs and ERNs. Our findings highlight the role of regulation forms in network topology, implementation and dynamics.
Project description:Many signaling systems show adaptation, the ability to reset themselves after responding to a stimulus. We computationally searched all possible three-node enzyme network topologies to identify those that could perform adaptation. Only two major core topologies emerge as robust solutions: a negative feedback loop with a buffering node and an incoherent feedforward loop with a proportioner node. Minimal circuits containing these topologies are, within proper regions of parameter space, sufficient to achieve adaptation. More complex circuits that robustly perform adaptation all contain at least one of these topologies at their core. This analysis yields a design table highlighting a finite set of adaptive circuits. Despite the diversity of possible biochemical networks, it may be common to find that only a finite set of core topologies can execute a particular function. These design rules provide a framework for functionally classifying complex natural networks and a manual for engineering networks. For a video summary of this article, see the PaperFlick file with the Supplemental Data available online.
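The exhaustive search over three-node topologies mentioned above starts from a plain enumeration step, which can be sketched as follows. The sketch assumes that each of the nine possible directed links among three nodes (self-loops included) is either activating, inhibiting, or absent; the node names and the encoding are hypothetical, and the filtering and parameter sampling that such a study would apply afterwards are not shown.

```python
from itertools import product

# Hypothetical node labels: input, regulator, output.
NODES = ("A", "B", "C")
# All 9 ordered pairs, self-loops included (an assumption of this sketch).
LINKS = [(src, tgt) for src in NODES for tgt in NODES]


def enumerate_topologies():
    """Yield every signed topology as a dict {(src, tgt): +1 or -1}."""
    for signs in product((-1, 0, 1), repeat=len(LINKS)):
        # Links with sign 0 are absent and therefore left out of the dict.
        yield {link: s for link, s in zip(LINKS, signs) if s != 0}


topologies = list(enumerate_topologies())
print(len(topologies))  # 3**9 combinations of {absent, +, -} over 9 links
```

Under these assumptions the search space has 3^9 = 19,683 candidate topologies, small enough to test each one individually, which is what makes an exhaustive approach feasible for three-node networks.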
Project description:Gene regulatory networks (GRNs) and gene expression data form a core element of systems biology-based phenotyping. Changes in the expression of transcription factors are commonly believed to have a causal effect on the expression of their targets. Here we evaluated in the best researched model organism, Escherichia coli, the consistency between a GRN and a large gene expression compendium. Surprisingly, a modest correlation was observed between the expression of transcription factors and their targets and, most noteworthy, both activating and repressing interactions were associated with positive correlation. When evaluated using a sign consistency model, we found the regulatory network was not more consistent with measured expression than random network models. We conclude that, at least in E. coli, one cannot expect a causal relationship between the expression of transcription factors and their targets, and that the current static GRN does not adequately explain transcriptional regulation. The implications of this are profound as they question what we consider established knowledge of the systemic biology of cells and point to methodological limitations with respect to single omics analysis, static networks and temporality.
Project description:BACKGROUND: Large-scale evaluation of gene expression variation among Caenorhabditis elegans lines that have diverged from a common ancestor allows for the analysis of a novel class of biological networks - evolutionary gene coexpression networks. Comparative analysis of these evolutionary networks has the potential to uncover the effects of natural selection in shaping coexpression network topologies since C. elegans mutation accumulation (MA) lines evolve essentially free from the effects of natural selection, whereas natural isolate (NI) populations are subject to selective constraints. RESULTS: We compared evolutionary gene coexpression networks for C. elegans MA lines versus NI populations to evaluate the role that natural selection plays in shaping the evolution of network topologies. MA and NI evolutionary gene coexpression networks were found to have very similar global topological properties as measured by a number of network topological parameters. Observed MA and NI networks show node degree distributions and average values for node degree, clustering coefficient, path length, eccentricity and betweenness that are statistically indistinguishable from one another yet highly distinct from randomly simulated networks. On the other hand, at the local level the MA and NI coexpression networks are highly divergent; pairs of genes coexpressed in the MA versus NI lines are almost entirely different, as are the connectivity and clustering properties of individual genes. CONCLUSION: It appears that selective forces shape how local patterns of coexpression change over time but do not control the global topology of C. elegans evolutionary gene coexpression networks. These results have implications for the evolutionary significance of global network topologies, which are known to be conserved across disparate complex systems.
Project description:Networks are real systems modelled through mathematical objects made up of nodes and links arranged into peculiar and deliberate (or partially deliberate) topologies. Studying these real-world topologies allows for several properties of interest to be revealed. In real networks, nodes are also identified by a certain number of non-structural features or metadata. Given the current possibility of collecting massive quantities of such metadata, it becomes crucial to identify automatically which are the most relevant for the observed structure. We propose a new method that, independently of the network size, is able not only to report the relevance of binary node metadata, but also to rank them. Such a method can be applied to networks from any domain, and we apply it in two heterogeneous cases: a temporal network of technology transfer and a protein-protein interaction network. Together with the relevance of node metadata, we investigate the redundancy of these metadata, displaying the results on a Redundancy-Relevance diagram, which is able to highlight the differences among vectors of metadata from both a structural and a non-structural point of view. The obtained results provide insights of a practical nature into the importance of the observed node metadata for the actual network structure.
Project description:A focused theme in systems biology is to uncover design principles of biological networks, that is, how specific network structures yield specific systems properties. For this purpose, we have previously developed a reverse engineering procedure to identify network topologies with a high likelihood of generating desired systems properties. Our method searches the continuous parameter space of an assembly of network topologies, without enumerating individual network topologies separately as traditionally done in other reverse engineering procedures. Here we tested this CPSS (continuous parameter space search) method on a previously studied problem: the resettable bistability of an Rb-E2F gene network in regulating the quiescence-to-proliferation transition of mammalian cells. From a simplified Rb-E2F gene network, we identified network topologies responsible for generating resettable bistability. The CPSS-identified topologies are consistent with those reported in the previous study based on individual topology search (ITS), demonstrating the effectiveness of the CPSS approach. Since the CPSS and ITS searches are based on different mathematical formulations and different algorithms, the consistency of the results also helps cross-validate both approaches. A unique advantage of the CPSS approach lies in its applicability to biological networks with large numbers of nodes. To aid the application of the CPSS approach to the study of other biological systems, we have developed a computer package that is available in Information S1.
Project description:Protein-protein interactions (PPIs) and their networks play a central role in all biological processes. Akin to the complete sequencing of genomes and their comparative analysis, complete descriptions of interactomes and their comparative analysis are fundamental to a deeper understanding of biological processes. A first step in such an analysis is to align two or more PPI networks. Here, we introduce an algorithm, IsoRank, for global alignment of multiple PPI networks. The guiding intuition here is that a protein in one PPI network is a good match for a protein in another network if their respective sequences and neighborhood topologies are a good match. We encode this intuition as an eigenvalue problem in a manner analogous to Google's PageRank method. Using IsoRank, we compute a global alignment of the Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Mus musculus, and Homo sapiens PPI networks. We demonstrate that incorporating PPI data in ortholog prediction results in improvements over existing sequence-only approaches and over predictions from local alignments of the yeast and fly networks. Previous methods have been effective at identifying conserved, localized network patterns across pairs of networks. This work takes the further step of performing a global alignment of multiple PPI networks. It simultaneously uses sequence similarity and network data and, unlike previous approaches, explicitly models the tradeoff inherent in combining them. We expect IsoRank, with its simultaneous handling of node similarity and network similarity, to be applicable across many scientific domains.
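To make the PageRank analogy concrete, the sketch below scores node pairs from two small networks by power iteration. It assumes one common way of writing the idea: the score of a pair (i, j) is a weighted mix of a sequence-similarity term and the degree-normalized scores of neighboring pairs, with a mixing weight `alpha`. The function name, the parameter defaults, and the toy inputs are assumptions for illustration, and the step that extracts an actual alignment from the scores is omitted.

```python
def pairwise_scores(adj1, adj2, seq_sim, alpha=0.6, iters=50):
    """Power iteration for PageRank-style pairwise similarity scores.

    adj1, adj2: 0/1 adjacency matrices (lists of lists) of the two networks.
    seq_sim:    non-negative sequence-similarity matrix, shape n1 x n2.
    alpha:      weight of the topology term versus the sequence term.
    """
    n1, n2 = len(adj1), len(adj2)
    # Node degrees; isolated nodes get degree 1 to avoid division by zero.
    deg1 = [sum(row) or 1 for row in adj1]
    deg2 = [sum(row) or 1 for row in adj2]
    total = sum(map(sum, seq_sim))
    prior = [[s / total for s in row] for row in seq_sim]
    scores = [[1.0 / (n1 * n2)] * n2 for _ in range(n1)]

    for _ in range(iters):
        new = [[0.0] * n2 for _ in range(n1)]
        for i in range(n1):
            for j in range(n2):
                # Each neighboring pair (u, v) spreads its score evenly
                # over its deg(u) * deg(v) neighboring pairs.
                topo = sum(
                    scores[u][v] / (deg1[u] * deg2[v])
                    for u in range(n1) if adj1[i][u]
                    for v in range(n2) if adj2[j][v]
                )
                new[i][j] = alpha * topo + (1 - alpha) * prior[i][j]
        norm = sum(map(sum, new))
        scores = [[x / norm for x in row] for row in new]
    return scores


# Two identical three-node paths (0 - 1 - 2) with uninformative sequence scores.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
uniform = [[1.0] * 3 for _ in range(3)]
result = pairwise_scores(path, path, uniform)
```

With uninformative sequence scores, the pair of the two central nodes ends up with the highest score, which reflects the intuition that nodes with matching neighborhood topologies are good matches. Setting `alpha` closer to 0 lets sequence similarity dominate instead.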
Project description:BACKGROUND: The development of algorithms to infer the structure of gene regulatory networks based on expression data is an important subject in bioinformatics research. Validation of these algorithms requires benchmark data sets for which the underlying network is known. Since experimental data sets of the appropriate size and design are usually not available, there is a clear need to generate well-characterized synthetic data sets that allow thorough testing of learning algorithms in a fast and reproducible manner. RESULTS: In this paper we describe a network generator that creates synthetic transcriptional regulatory networks and produces simulated gene expression data that approximates experimental data. Network topologies are generated by selecting subnetworks from previously described regulatory networks. Interaction kinetics are modeled by equations based on Michaelis-Menten and Hill kinetics. Our results show that the statistical properties of these topologies more closely approximate those of genuine biological networks than do those of different types of random graph models. Several user-definable parameters adjust the complexity of the resulting data set with respect to the structure learning algorithms. CONCLUSION: This network generation technique offers a valid alternative to existing methods. The topological characteristics of the generated networks more closely resemble the characteristics of real transcriptional networks. Simulation of the network scales well to large networks. The generator models different types of biological interactions and produces biologically plausible synthetic gene expression data.
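For readers unfamiliar with the rate laws named above, here is a minimal sketch of Hill-type activation and repression (Michaelis-Menten-like saturation is the special case n = 1) driving the expression of a single target gene. The function names, the parameter values, and the simple Euler integration are assumptions for illustration and are not taken from the generator described in the abstract.

```python
def hill_activation(x: float, k: float, n: float = 1.0) -> float:
    """Fractional activation by a regulator at level x (half-maximal at x = k)."""
    return x ** n / (k ** n + x ** n)


def hill_repression(x: float, k: float, n: float = 1.0) -> float:
    """Fractional activity left under a repressor at level x."""
    return k ** n / (k ** n + x ** n)


def simulate_target(x: float, k: float, n: float, vmax: float = 1.0,
                    decay: float = 1.0, dt: float = 0.01, steps: int = 5000) -> float:
    """Euler integration of dy/dt = vmax * hill_activation(x) - decay * y, y(0) = 0."""
    y = 0.0
    production = vmax * hill_activation(x, k, n)  # constant for a fixed regulator level
    for _ in range(steps):
        y += dt * (production - decay * y)
    return y


# A regulator held at its half-saturation level drives the target to
# half of its maximal steady state (vmax / decay).
level = simulate_target(x=2.0, k=2.0, n=2.0)
```

A larger Hill exponent `n` makes the response more switch-like around `k`, which is one of the knobs a synthetic data generator can use to vary the difficulty of the inference problem.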