ICOSSY: An Online Tool for Context-Specific Subnetwork Discovery from Gene Expression Data.
ABSTRACT: Pathway analyses help reveal underlying molecular mechanisms of complex biological phenotypes. Biologists tend to perform multiple pathway analyses on the same dataset, as there is no single answer. It is often inefficient for them to implement and/or install all the algorithms by themselves. Online tools can help the community in this regard. Here we present an online gene expression analytical tool called iCOSSY which implements a novel pathway-based COntext-specific Subnetwork discoverY (COSSY) algorithm. iCOSSY also includes a few modifications of COSSY to increase its reliability and interpretability. Users can upload their gene expression datasets, and discover important subnetworks of closely interacting molecules to differentiate between two phenotypes (context). They can also interactively visualize the resulting subnetworks. iCOSSY is a web server that finds subnetworks that are differentially expressed in two phenotypes. Users can visualize the subnetworks to understand the biology of the difference.
Project description:The biological regulatory system is highly dynamic. The correlations between many functionally related genes change over different biological conditions. Finding dynamic relations on the existing biological network may reveal important regulatory mechanisms. Currently no method is available to detect subnetwork-level dynamic correlations systematically on the genome-scale network. Two major issues have hampered the development. First, gene expression profiling data usually do not contain time course measurements to facilitate the analysis of dynamic relations, which can be partially addressed by using certain genes as indicators of biological conditions. Second, it is unclear how to effectively delineate subnetworks and define dynamic relations between them. Here we propose a new method named LANDD (Liquid Association for Network Dynamics Detection) to find subnetworks that show substantial dynamic correlations, defined as subnetwork A being concentrated with Liquid Association scouting genes for subnetwork B. The method produces easily interpretable results because of its focus on subnetworks that tend to comprise functionally related genes. Also, the collective behaviour of genes in a subnetwork is a much more reliable indicator of underlying biological conditions than single genes used as indicators. We conducted extensive simulations to validate the method's ability to detect subnetwork-level dynamic correlations. Using a real gene expression dataset and the human protein-protein interaction network, we demonstrate that the method links subnetworks of distinct biological processes, with both confirmed relations and plausible new functional implications. 
We also found that signal transduction pathways tend to show extensive dynamic relations with other functional groups. The R package is available at https://cran.r-project.org/web/packages/LANDD. CONTACTS: email@example.com, firstname.lastname@example.org or email@example.com. Supplementary information: Supplementary data are available at Bioinformatics online.
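As a rough illustration of the quantity behind "Liquid Association", the sketch below estimates a three-way score as the mean of the product of three standardized expression profiles, where z plays the role of the scouting gene. This is an assumption-laden simplification and not the LANDD implementation: the function names and toy data are hypothetical, and published estimators typically also transform the profiles to normal scores, which is omitted here.

```python
import math


def standardize(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def liquid_association(x, y, z):
    """Mean of the product of standardized x, y and z (hypothetical helper).

    A positive score suggests that the x-y correlation increases with z,
    a negative score that it decreases with z.
    """
    xs, ys, zs = standardize(x), standardize(y), standardize(z)
    return sum(a * b * c for a, b, c in zip(xs, ys, zs)) / len(xs)


# Toy profiles, not real data: x and y are anti-correlated when z is low
# and correlated when z is high, so their overall correlation is zero.
z = [-2, -2, -1, -1, 1, 1, 2, 2]
x = [1, -1, 1, -1, 1, -1, 1, -1]
y = [-1, 1, -1, 1, 1, -1, 1, -1]

la_score = liquid_association(x, y, z)
plain_correlation = sum(
    a * b for a, b in zip(standardize(x), standardize(y))
) / len(x)
```

In this toy case the ordinary correlation between x and y is zero while the three-way score is clearly positive, which is the kind of condition-dependent relation an ordinary pairwise correlation would miss.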
Project description:Metscape is a plug-in for Cytoscape, used to visualize and interpret metabolomic data in the context of human metabolic networks. We have developed a metabolite database by extracting and integrating information from several public sources. By querying this database, Metscape allows users to trace the connections between metabolites and genes, visualize compound networks and display compound structures as well as information for reactions, enzymes, genes and pathways. Applying the pathway filter, users can create subnetworks that consist of compounds and reactions from a given pathway. Metscape allows users to upload experimental data, and to visualize and explore compound networks over time or across experimental conditions. Color and size of the nodes are used to visualize these dynamic changes. Metscape can display the entire metabolic network or any of the pathway-specific networks that exist in the database. Metscape can be installed from within Cytoscape 2.6.x under the 'Network and Attribute I/O' category. For more information, please visit http://metscape.ncibi.org/tryplugin.html.
Project description:Online health communities (OHCs) provide a convenient and commonly used way for people to connect around shared health experiences, exchange information, and receive social support. Users often interact with peers via multiple communication methods, forming a multirelational social network. Use of OHCs is common among smokers, but to date, there have been no studies on users' online interactions via different means of online communications and how such interactions are related to smoking cessation. Such information can be retrieved in multirelational social networks and could be useful in the design and management of OHCs. The objective was to examine the social network structure of an OHC for smoking cessation using a multirelational approach, and to explore links between subnetwork position (ie, centrality) and smoking abstinence. We used NetworkX to construct 4 subnetworks based on users' interactions via blogs, group discussions, message boards, and private messages. We illustrated topological properties of each subnetwork, including its degree distribution, density, and connectedness, and compared similarities among these subnetworks by correlating node centrality and measuring edge overlap. We also investigated coevolution dynamics of this multirelational network by analyzing tie formation sequences across subnetworks. In a subset of users who participated in a randomized, smoking cessation treatment trial, we conducted user profiling based on users' centralities in the 4 subnetworks and identified user groups using clustering techniques. We further examined 30-day smoking abstinence at 3 months postenrollment in relation to users' centralities in the 4 subnetworks. The 4 subnetworks have different topological characteristics, with message board having the most nodes (36,536) and group discussion having the highest network density (4.35×10⁻³). 
Blog and message board subnetworks had the most similar structures with an in-degree correlation of .45, out-degree correlation of .55, and Jaccard coefficient of .23 for edge overlap. A new tie in the group discussion subnetwork had the lowest probability of triggering subsequent ties among the same two users in other subnetworks: 6.33% (54,142/855,893) for 2-tie sequences and 2.13% (18,207/855,893) for 3-tie sequences. Users' centralities varied across the 4 subnetworks. Among a subset of users enrolled in a randomized trial, those with higher centralities across subnetworks generally had higher abstinence rates, although high centrality in the group discussion subnetwork was not associated with higher abstinence rates. A multirelational approach revealed insights that could not be obtained by analyzing the aggregated network alone, such as the ineffectiveness of group discussions in triggering social ties of other types, the advantage of blogs, message boards, and private messages in leading to subsequent social ties of other types, and the weak connection between one's centrality in the group discussion subnetwork and smoking abstinence. These insights have implications for the design and management of online social networks for smoking cessation.
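The density and edge-overlap measures mentioned above can be sketched in a few lines. The study reports using NetworkX; the version below deliberately uses only plain Python sets to stay dependency-free, assumes directed ties (sender, receiver), and runs on made-up toy edges rather than the study's data. The function names are hypothetical.

```python
def density(nodes, edges):
    """Directed-graph density: observed edges / possible ordered node pairs."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(edges) / (n * (n - 1))


def jaccard_edge_overlap(edges_a, edges_b):
    """Jaccard coefficient of two edge sets: shared edges / all distinct edges."""
    a, b = set(edges_a), set(edges_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Toy directed ties between users; illustrative only.
blog_edges = {("u1", "u2"), ("u2", "u3"), ("u3", "u1")}
board_edges = {("u1", "u2"), ("u2", "u3"), ("u4", "u1"), ("u4", "u2")}

blog_nodes = {user for edge in blog_edges for user in edge}
board_nodes = {user for edge in board_edges for user in edge}

blog_density = density(blog_nodes, blog_edges)      # 3 of 6 possible ties
board_density = density(board_nodes, board_edges)   # 4 of 12 possible ties
overlap = jaccard_edge_overlap(blog_edges, board_edges)  # 2 shared of 5 distinct
```

A Jaccard coefficient computed this way ranges from 0 (no shared ties) to 1 (identical tie sets), which is how a value such as .23 for the blog and message board subnetworks can be read.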
Project description:Genes act in concert via specific networks to drive various biological processes, including progression of diseases such as cancer. Under different phenotypes, different subsets of the gene members of a network participate in a biological process. Single gene analyses are less effective in identifying such core gene members (subnetworks) within a gene set/network, as compared to gene set/network-based analyses. Hence, it is useful to identify a discriminative classifier by focusing on the subnetworks that correspond to different phenotypes. Here we present a novel algorithm to automatically discover the important subnetworks of closely interacting molecules to differentiate between two phenotypes (context) using gene expression profiles. We name it COSSY (COntext-Specific Subnetwork discoverY). It is a non-greedy algorithm and thus unlikely to have local optima problems. COSSY works for any interaction network regardless of the network topology. One added benefit of COSSY is that it can also be used as a highly accurate classification platform which can produce a set of interpretable features.
Project description:Emerging research demonstrates the potential of protein-protein interaction (PPI) networks in uncovering the mechanistic bases of cancers, through identification of interacting proteins that are coordinately dysregulated in tumorigenic and metastatic samples. When used as features for classification, such coordinately dysregulated subnetworks improve diagnosis and prognosis of cancer considerably over single-gene markers. However, existing methods formulate coordination between multiple genes through additive representation of their expression profiles and utilize fast heuristics to identify dysregulated subnetworks, which may not be well suited to the potentially combinatorial nature of coordinate dysregulation. Here, we propose a combinatorial formulation of coordinate dysregulation and decompose the resulting objective function to cast the problem as one of identifying subnetwork state functions that are indicative of phenotype. Based on this formulation, we show that coordinate dysregulation of larger subnetworks can be bounded using simple statistics on smaller subnetworks. We then use these bounds to devise an efficient algorithm, Crane, that can search the subnetwork space more effectively than existing algorithms. Comprehensive cross-classification experiments show that subnetworks identified by Crane outperform those identified by additive algorithms in predicting metastasis of colorectal cancer (CRC).
Project description:BACKGROUND:Accurate prediction of cancer prognosis based on gene expression data is generally difficult, and identifying robust prognostic markers for cancer remains a challenging problem. Recent studies have shown that modular markers, such as pathway markers and subnetwork markers, can provide better snapshots of the underlying biological mechanisms by incorporating additional biological information, thereby leading to more accurate cancer classification. RESULTS:In this paper, we propose a novel method for simultaneously identifying robust synergistic subnetwork markers that can accurately predict cancer prognosis. The proposed method utilizes an efficient message-passing algorithm called affinity propagation, based on which we identify groups - or subnetworks - of discriminative and synergistic genes, whose protein products are closely located in the protein-protein interaction (PPI) network. Unlike other existing subnetwork marker identification methods, our proposed method can simultaneously identify multiple nonoverlapping subnetwork markers that can synergistically predict cancer prognosis. CONCLUSIONS:Evaluation results based on multiple breast cancer datasets demonstrate that the proposed message-passing approach can identify robust subnetwork markers in the human PPI network, which have higher discriminative power and better reproducibility compared to those identified by previous methods. The identified subnetwork markers can lead to better cancer classifiers with improved overall performance and consistency across independent cancer datasets.
Project description:Fusarium verticillioides is recognized as an important stalk rot pathogen of maize worldwide, but our knowledge of genetic mechanisms underpinning this pathosystem is limited. Previously, we identified a striatin-like protein Fsr1 that plays an important role in stalk rot. To further characterize transcriptome networks downstream of Fsr1, we performed next-generation sequencing (NGS) to investigate relative read abundance and also to infer co-expression networks utilizing the preprocessed expression data through partial correlation. We used a probabilistic pathway activity inference strategy to identify functional subnetwork modules likely involved in virulence. Each subnetwork module consisted of multiple correlated genes with coordinated expression patterns, but the collective activation levels were significantly different in F. verticillioides wild type versus fsr1 mutant. We also identified putative hub genes from predicted subnetworks for functional validation and network robustness studies through mutagenesis, virulence and qPCR assays. Our results suggest that these genes are important virulence genes that regulate the expression of closely correlated genes, demonstrating that these are important hubs of their respective subnetworks. Lastly, we used key F. verticillioides virulence genes to computationally predict a subnetwork of maize genes that potentially respond to fungal genes by applying a cointegration-correlation-expression strategy.
Project description:The extraction of targeted subnetworks is a powerful way to identify functional modules and pathways within complex networks. Here, we present SubNet, a Java-based stand-alone program for extracting subnetworks, given a basal network and a set of selected nodes. Designed with a user-friendly graphical interface, SubNet combines four different extraction methods, which offer the possibility to interrogate a biological network according to the question investigated. Of note, we developed a method based on the highly successful Google PageRank algorithm to extract the subnetwork using the node centrality metric, to which possible node weights of the selected genes can be incorporated. http://www.zdzlab.org/1/subnet.html
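The abstract does not spell out how SubNet applies PageRank, so the following is only a generic sketch of one plausible reading: a personalized PageRank in which the random walk restarts at the selected nodes in proportion to their weights, followed by keeping the top-ranked nodes and the edges among them. The function names, the damping factor of 0.85, the `top_k` cut-off and the toy network are all assumptions made for illustration.

```python
def personalized_pagerank(adjacency, seed_weights, damping=0.85, iterations=100):
    """Power iteration for PageRank with restarts biased toward seed nodes."""
    nodes = list(adjacency)
    total = sum(seed_weights.get(n, 0.0) for n in nodes)
    restart = {n: seed_weights.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        updated = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if neighbours:
                # Spread this node's rank evenly over its neighbours.
                share = damping * rank[n] / len(neighbours)
                for m in neighbours:
                    updated[m] += share
            else:
                # Node without outgoing edges: hand its rank back to the seeds.
                for m in nodes:
                    updated[m] += damping * rank[n] * restart[m]
        rank = updated
    return rank


def extract_subnetwork(adjacency, seed_weights, top_k):
    """Keep the top_k ranked nodes and only the edges among them."""
    rank = personalized_pagerank(adjacency, seed_weights)
    keep = set(sorted(rank, key=rank.get, reverse=True)[:top_k])
    return {n: [m for m in adjacency[n] if m in keep] for n in keep}


# Toy undirected path A - B - C - D - E as adjacency lists; gene A is the seed.
toy_network = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
ranks = personalized_pagerank(toy_network, {"A": 1.0})
subnetwork = extract_subnetwork(toy_network, {"A": 1.0}, top_k=3)
```

With a single seed, rank concentrates on the seed and its close neighbours, so the extracted subnetwork here consists of the three nodes nearest to A. Unequal entries in `seed_weights` would shift the restart probability toward the more heavily weighted selected genes.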
Project description:Subnetwork analysis can explore complex patterns of entire molecular pathways for the purpose of drug target identification. In this article, the gene expression profiles of a cohort of patients with breast cancer are integrated with protein-protein interaction (PPI) networks using, simultaneously, both edge scoring and node scoring. A novel optimization algorithm, integrated optimization method to identify deregulated subnetwork (IODNE), is developed to search for the optimal dysregulated subnetwork of the merged gene and protein network. IODNE is applied to select subnetworks for Luminal-A breast cancer from The Cancer Genome Atlas (TCGA) data. A large fraction of cancer-related genes and the well-known clinical targets, ER1/PR and HER2, are found by IODNE. This validates the utility of IODNE. When applying IODNE to the triple-negative breast cancer (TNBC) subtype data, we identified subnetworks that contain genes such as ERBB2, HRAS, PGR, CAD, POLE, and SLC2A1.
Project description:BACKGROUND: Finding reliable gene markers for accurate disease classification is very challenging due to a number of reasons, including the small sample size of typical clinical data, high noise in gene expression measurements, and the heterogeneity across patients. In fact, gene markers identified in independent studies often do not coincide with each other, suggesting that many of the predicted markers may have no biological significance and may be simply artifacts of the analyzed dataset. To find more reliable and reproducible diagnostic markers, several studies proposed to analyze the gene expression data at the level of groups of functionally related genes, such as pathways. Studies have shown that pathway markers tend to be more robust and yield more accurate classification results. One practical problem of the pathway-based approach is the limited coverage of genes by currently known pathways. As a result, potentially important genes that play critical roles in cancer development may be excluded. To overcome this problem, we propose a novel method for identifying reliable subnetwork markers in a human protein-protein interaction (PPI) network. RESULTS: In this method, we overlay the gene expression data with the PPI network and look for the most discriminative linear paths that consist of discriminative genes that are highly correlated to each other. The overlapping linear paths are then optimally combined into subnetworks that can potentially serve as effective diagnostic markers. We tested our method on two independent large-scale breast cancer datasets and compared the effectiveness and reproducibility of the identified subnetwork markers with gene-based and pathway-based markers. We also compared the proposed method with an existing subnetwork-based method. 
CONCLUSIONS: The proposed method can efficiently find reliable subnetwork markers that outperform the gene-based and pathway-based markers in terms of discriminative power, reproducibility and classification performance. Subnetwork markers found by our method are highly enriched in common GO terms, and they can more accurately classify breast cancer metastasis compared to markers found by a previous method.