Consistent dissection of the protein interaction network by combining global and local metrics.
ABSTRACT: We propose a new network decomposition method to systematically identify protein interaction modules in the protein interaction network. Our method incorporates both a global metric and a local metric for balance and consistency. We have compared the performance of our method with several earlier approaches on both simulated and real datasets using different criteria, and show that our method is more robust to network alterations and more effective at discovering functional protein modules.
Project description:The molecular profiles exhibited in different cancer types are very different; hence, discovering distinct functional modules associated with specific cancer types is very important to understand the distinct functions associated with them. Protein-protein interaction networks carry vital information about molecular interactions in cellular systems, and identification of functional modules (subgraphs) in these networks is one of the most important applications of biological network analysis. In this study, we developed a new graph-theory-based method to identify distinct functional modules from nine different cancer protein-protein interaction networks. The method is composed of three major steps: (i) extracting modules from protein-protein interaction networks using network clustering algorithms; (ii) identifying distinct subgraphs from the derived modules; and (iii) identifying distinct subgraph patterns from distinct subgraphs. The subgraph patterns were evaluated using experimentally determined cancer-specific protein-protein interaction data from the Ingenuity knowledgebase, to identify distinct functional modules that are specific to each cancer type. We identified cancer-type specific subgraph patterns that may represent the functional modules involved in the molecular pathogenesis of different cancer types. Our method can serve as an effective tool to discover cancer-type specific functional modules from large protein-protein interaction networks.
Project description:Molecular modeling frequently constructs classification models for the prediction of two-class entities, such as compound bio(in)activity, chemical property (non)existence, protein (non)interaction, and so forth. The models are evaluated using well known metrics such as accuracy or true positive rates. However, these frequently used metrics applied to retrospective and/or artificially generated prediction datasets can potentially overestimate true performance in actual prospective experiments. Here, we systematically consider metric value surface generation as a consequence of data balance, and propose the computation of an inverse cumulative distribution function taken over a metric surface. The proposed distribution analysis can aid in the selection of metrics when formulating study design. In addition to theoretical analyses, a practical example in chemogenomic virtual screening highlights the care required in metric selection and interpretation.
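To make the idea of a metric surface concrete, the following is a minimal Python sketch under stated assumptions, not the authors' implementation. It assumes the surface is taken over a regular grid of true positive rate (TPR) and true negative rate (TNR) values at a fixed fraction of positives, and that the inverse cumulative distribution function is read as the share of that surface on which the metric reaches a given threshold. The function names, grid resolution, and threshold are hypothetical.

```python
def accuracy(tpr, tnr, pos_frac):
    """Accuracy as a function of TPR, TNR, and the fraction of positives."""
    return pos_frac * tpr + (1.0 - pos_frac) * tnr


def precision(tpr, tnr, pos_frac):
    """Precision as a function of TPR, TNR, and the fraction of positives."""
    tp = pos_frac * tpr
    fp = (1.0 - pos_frac) * (1.0 - tnr)
    # Precision is undefined when nothing is predicted positive; report 0.0.
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0


def metric_surface(metric, pos_frac, steps=51):
    """Evaluate a metric on a regular (TPR, TNR) grid for one data balance."""
    grid = [i / (steps - 1) for i in range(steps)]
    return [metric(tpr, tnr, pos_frac) for tpr in grid for tnr in grid]


def inverse_cdf(surface, threshold):
    """Share of the surface on which the metric is at least `threshold`."""
    return sum(value >= threshold for value in surface) / len(surface)


# Hypothetical comparison: a balanced set versus one with 5% positives.
balanced = metric_surface(precision, pos_frac=0.5)
imbalanced = metric_surface(precision, pos_frac=0.05)
share_balanced = inverse_cdf(balanced, 0.9)
share_imbalanced = inverse_cdf(imbalanced, 0.9)
```

In this toy setting a smaller share of the precision surface clears the 0.9 threshold when positives are rare, which illustrates why a metric value obtained on an artificially balanced retrospective set may not carry over to a prospective screen.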
Project description:<h4>Background</h4>Since genes involved in the same biological modules usually present correlated expression profiles, many computational methods have been proposed to identify gene functional modules based on expression profile data. Recently, the Sparse Singular Value Decomposition (SSVD) method has been proposed to bicluster gene expression data to identify gene modules. However, this model handles only gene expression data and does not integrate gene interaction information. Ignoring prior gene interaction information can make the identified gene modules hard to interpret biologically.<h4>Results</h4>In this paper, we develop a Sparse Network-regularized SVD (SNSVD) method that integrates a prior gene interaction network from a protein-protein interaction network and gene expression data to identify underlying gene functional modules. The results on a set of simulated data show that SNSVD is more effective than the traditional SVD-based methods. Further experimental results on real cancer genomic data show that most co-expressed modules are not only significantly enriched in GO/KEGG pathways, but also correspond to dense sub-networks in the prior gene interaction network. In addition, we use our method to identify ten differentially co-expressed miRNA-gene modules by integrating matched miRNA and mRNA expression data of breast cancer from The Cancer Genome Atlas (TCGA). Several important breast cancer related miRNA-gene modules are discovered.<h4>Conclusions</h4>All the results demonstrate that SNSVD can overcome the drawbacks of SSVD and capture more biologically relevant functional modules by incorporating a prior gene interaction network. These identified functional modules may provide a new perspective for understanding the diagnosis, occurrence and progression of cancer.
Project description:(1) Background: Psoriasis is a multifactorial chronic inflammatory disorder of the skin, with significant morbidity, characterized by hyperproliferation of the epidermis. Even though the etiology of psoriasis is not fully understood, it is believed to be multifactorial, with numerous key components. (2) Methods: In order to cast light on the complex molecular interactions in psoriasis vulgaris at both the protein-protein interaction and transcriptomic levels, we studied a set of microarray gene expression analyses consisting of 170 paired lesional and non-lesional samples. Afterwards, a network analysis was conducted on the protein-protein interaction network of differentially expressed genes, based on micro- and macro-level network metrics, from a system-level standpoint. (3) Results: We found 17 top communicative genes, all of which were experimentally proven to be pivotal in psoriasis, which were identified in two modules, namely the cell cycle and immune system. Intra- and inter-gene interaction subnetworks from the top communicative genes might provide further insight into the corresponding characteristic interactions. (4) Conclusions: Potential gene combinations for therapeutic/diagnostic purposes were identified. Moreover, our proposed workflow could be of interest to a broader range of future biological network analysis studies.
Project description:Global transcript expression experiments are commonly used to investigate the biological processes that underlie complex traits. These studies can exhibit complex patterns of pleiotropy when trans-acting genetic factors influence overlapping sets of multiple transcripts. Dissecting these patterns into biological modules with distinct genetic etiology can provide models of how genetic variants affect specific processes that contribute to a trait. Here we identify transcript modules associated with pleiotropic genetic factors and apply genetic interaction analysis to disentangle the regulatory architecture in a mouse intercross study of kidney function. The method, called the combined analysis of pleiotropy and epistasis (CAPE), has been previously used to model genetic networks for multiple physiological traits. It simultaneously models multiple phenotypes to identify direct genetic influences as well as influences mediated through genetic interactions. We first identified candidate trans expression quantitative trait loci (eQTL) and the transcripts potentially affected. We then clustered the transcripts into modules of co-expressed genes, from which we computed summary module phenotypes. Finally, we applied CAPE to map the network of interacting module QTL (modQTL) affecting the gene modules. The resulting network mapped how multiple modQTL both directly and indirectly affect modules associated with metabolic functions and biosynthetic processes. This work demonstrates how the integration of pleiotropic signals in gene expression data can be used to infer a complex hypothesis of how multiple loci interact to co-regulate transcription programs, thereby providing additional constraints to prioritize validation experiments.
Project description:Since organism development and many critical cell biology processes are organized in modular patterns, many algorithms have been proposed to detect modules. In this study, a new method, MOfinder, was developed to detect overlapping modules in a protein-protein interaction (PPI) network. We demonstrate that our method is more accurate than five other methods. We then applied MOfinder to the yeast and human PPI networks and explored the overlapping information. Using the overlapping modules of the human PPI network, we constructed a module-module communication network. Functional annotation showed that immune-related and cancer-related proteins consistently appeared together in the same modules, which offers some clues for immune therapy for cancer. Our study of overlapping modules suggests a new perspective on the analysis of PPI networks and improves our understanding of disease.
Project description:Current work in elucidating relationships between diseases has largely been based on pre-existing knowledge of disease genes. Consequently, these studies are limited in their discovery of new and unknown disease relationships. We present the first quantitative framework to compare and contrast diseases by an integrated analysis of disease-related mRNA expression data and the human protein interaction network. We identified 4,620 functional modules in the human protein network and provided a quantitative metric to record their responses in 54 diseases leading to 138 significant similarities between diseases. Fourteen of the significant disease correlations also shared common drugs, supporting the hypothesis that similar diseases can be treated by the same drugs, allowing us to make predictions for new uses of existing drugs. Finally, we also identified 59 modules that were dysregulated in at least half of the diseases, representing a common disease-state "signature". These modules were significantly enriched for genes that are known to be drug targets. Interestingly, drugs known to target these genes/proteins are already known to treat significantly more diseases than drugs targeting other genes/proteins, highlighting the importance of these core modules as prime therapeutic opportunities.
Project description:Recently, computational approaches integrating copy number aberrations (CNAs) and gene expression (GE) have been extensively studied to identify cancer-related genes and pathways. In this work, we integrate these two data sets with protein-protein interaction (PPI) information to find cancer-related functional modules. To integrate CNA and GE data, we first built a gene-gene relationship network from a set of seed genes by enumerating all types of pairwise correlations, e.g. GE-GE, CNA-GE, and CNA-CNA, over multiple patients. Next, we propose a voting-based cancer module identification algorithm by combining topological and data-driven properties (VToD algorithm) by using the gene-gene relationship network as a source of data-driven information, and the PPI data as topological information. We applied the VToD algorithm to 266 glioblastoma multiforme (GBM) and 96 ovarian carcinoma (OVC) samples that have both expression and copy number measurements, and identified 22 GBM modules and 23 OVC modules. Among 22 GBM modules, 15, 12, and 20 modules were significantly enriched with cancer-related KEGG, BioCarta pathways, and GO terms, respectively. Among 23 OVC modules, 19, 18, and 23 modules were significantly enriched with cancer-related KEGG, BioCarta pathways, and GO terms, respectively. Similarly, we also observed that 9 and 2 GBM modules and 15 and 18 OVC modules were enriched with cancer gene census (CGC) and specific cancer driver genes, respectively. Our proposed module-detection algorithm significantly outperformed other existing methods in terms of both functional and cancer gene set enrichments. Most of the cancer-related pathways from both cancer data sets found in our algorithm contained more than two types of gene-gene relationships, showing strong positive correlations between the number of different types of relationship and CGC enrichment [Formula: see text]-values (0.64 for GBM and 0.49 for OVC). 
This study suggests that identified modules containing both expression changes and CNAs can explain cancer-related activities with greater insights.
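The first step described above, building a gene-gene relationship network by enumerating pairwise correlations over patients, can be sketched in Python as follows. This is an illustration only, not the VToD implementation: the use of Pearson correlation, the 0.8 cutoff, the function names, and the toy values are all assumptions introduced here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def relationship_network(ge, cna, threshold=0.8):
    """Map each gene pair to the set of relationship types that pass the cutoff.

    `ge` and `cna` map gene name -> per-patient values, in the same patient order.
    The cutoff on the absolute correlation is a hypothetical choice.
    """
    edges = {}
    for g1, g2 in combinations(sorted(ge), 2):
        types = set()
        if abs(pearson(ge[g1], ge[g2])) >= threshold:
            types.add("GE-GE")
        if abs(pearson(cna[g1], cna[g2])) >= threshold:
            types.add("CNA-CNA")
        # CNA-GE is checked in both directions for the pair.
        cross = max(abs(pearson(cna[g1], ge[g2])), abs(pearson(cna[g2], ge[g1])))
        if cross >= threshold:
            types.add("CNA-GE")
        if types:
            edges[(g1, g2)] = types
    return edges


# Toy data for three hypothetical genes measured in five patients.
ge = {"A": [1, 2, 3, 4, 5], "B": [2, 4, 6, 8, 10], "C": [5, 1, 4, 2, 3]}
cna = {"A": [0, 0, 1, 1, 2], "B": [0, 1, 0, 1, 0], "C": [2, 1, 1, 0, 0]}
edges = relationship_network(ge, cna)
```

Counting the entries in each edge's set gives the number of relationship types linking a gene pair, the quantity that the abstract relates to CGC enrichment.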
Project description:<h4>Background</h4>Signalling pathways relay information by transmitting signals from cell surface receptors to intracellular effectors that eventually activate the transcription of target genes. Since signalling pathways involve several types of molecular interactions including protein-protein interactions, we postulated that investigating their organization in the context of the global protein-protein interaction network could provide a new integrated view of signalling mechanisms.<h4>Results</h4>Using a graph-theory-based method to analyse the fly protein-protein interaction network, we found that each signalling pathway is organized in two to three different signalling modules. These modules contain canonical proteins of the signalling pathways, known regulators, and other proteins that are thereby predicted to participate in the signalling mechanisms. Connections between the signalling modules are prominent compared with the network's other modules, and interactions within and between signalling modules are among the more central routes of the interaction network.<h4>Conclusion</h4>Altogether, these modules form an interactome sub-network devoted to signalling with particular topological properties: modularity, density and centrality. This finding reflects the integration of the signalling system into cell functioning and its important role in connecting and coordinating different biological processes at the level of the interactome.
Project description:Drug repurposing/repositioning, which aims to find novel indications for existing drugs, helps reduce the time and cost of drug development. Over the past decade, gene expression profiles of drug-stimulated samples have been successfully used in drug repurposing. However, most existing methods neglect gene modules and the interactions among them, although cross-talk among pathways is common in drug response. It is essential to develop a method that uses this cross-talk information to predict reliable candidate associations. In this study, we developed MNBDR (Module Network Based Drug Repositioning), a novel method that uses a module network to screen drugs. It integrates human protein-protein interactions and gene expression profiles to predict drug candidates for diseases. Specifically, MNBDR mines dense modules from the protein-protein interaction (PPI) network and constructs a module network to reveal cross-talk among modules. Then, using the module network together with existing gene expression data sets of drug-stimulated samples and disease samples, we used random walk algorithms to capture essential modules in disease development and proposed a new indicator to screen potential drugs for a given disease. Results showed that MNBDR could provide better performance than popular methods. Moreover, functional analysis of the essential modules in the network indicated that our method could reveal biological mechanisms in drug response.
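As an illustration of how a random walk can rank modules in a module network, here is a minimal Python sketch of a random walk with restart. The abstract does not say which random walk variant MNBDR uses, so the restart formulation, the restart probability, the function name, and the toy network are all assumptions made here for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate a random walk with restart on an undirected graph.

    `adj` maps each node to a list of neighbours; every neighbour must also be a key.
    `seeds` is the set of nodes the walker returns to with probability `restart`.
    Returns a dict of visiting probabilities that sums to 1.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adj[u]
            if not neighbours:
                # Isolated node: return its walking mass to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Toy module network: M1 is a hypothetical disease-associated seed module.
module_net = {
    "M1": ["M2", "M3"],
    "M2": ["M1", "M3"],
    "M3": ["M1", "M2", "M4"],
    "M4": ["M3"],
}
scores = random_walk_with_restart(module_net, seeds={"M1"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Modules closer to the seed in the network receive higher scores, so `ranking` orders modules by their proximity to the seed under this particular formulation.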