Project description:A genomic catalogue of protein-protein interactions is a rich source of information, particularly for exploring the relationships between proteins. Numerous systems-wide and small-scale experiments have been conducted to identify interactions; however, our knowledge of all interactions for any one species is incomplete, and alternative means to expand these network maps are needed. We therefore took a comparative biology approach to predict protein-protein interactions across five species (human, mouse, fly, worm, and yeast) and developed InterologFinder for research biologists to easily navigate these data. We also developed a confidence score for interactions based on available experimental evidence and conservation across species. The connectivity of the resultant networks was determined to have scale-free distribution, small-world properties, and increased local modularity, indicating that the added interactions do not disrupt our current understanding of protein network structures. We show examples of how these improved interactomes can be used to analyze a genome-scale dataset (RNAi screen) and to assign new function to proteins. Predicted interactions within this dataset were tested by co-immunoprecipitation, resulting in a high rate of validation, suggesting the high quality of the networks produced. Protein-protein interactions were predicted in five species, based on orthology. An InteroScore, a score accounting for homology, number of orthologues with evidence of interactions, and number of unique observations of interactions, is given to each known and predicted interaction. Our website http://www.interologfinder.org provides research biologists intuitive access to these data.
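The orthology-based prediction step described above can be illustrated with a minimal sketch. It assumes a simple rule: an interaction observed in a source species is transferred to every pair of orthologues in the target species. The function name `predict_interologs`, the toy identifiers, and the choice to skip self-interactions are assumptions made for illustration; this is not InterologFinder's implementation, and the InteroScore calculation is not reproduced here.

```python
from itertools import product


def predict_interologs(source_ppis, orthologs):
    """Transfer interactions from a source species to a target species.

    source_ppis: iterable of (protein_a, protein_b) pairs in the source species.
    orthologs:   dict mapping a source protein to a set of target orthologues.
    Returns a set of unordered target pairs (frozensets).
    """
    predicted = set()
    for a, b in source_ppis:
        # Every combination of orthologues of a and b is a candidate interolog.
        for ta, tb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            if ta != tb:  # assumption: self-interactions are skipped
                predicted.add(frozenset((ta, tb)))
    return predicted


# Toy data, invented for illustration only.
yeast_ppis = [("Y1", "Y2"), ("Y2", "Y3")]
yeast_to_human = {"Y1": {"H1"}, "Y2": {"H2a", "H2b"}}
human_predictions = predict_interologs(yeast_ppis, yeast_to_human)
```

In this toy case the second yeast interaction yields no prediction because `Y3` has no listed orthologue, which is the kind of gap a conservation-aware confidence score would need to account for.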
Project description:The hippocampal subfield CA3 is thought to function as an auto-associative network that stores experiences as memories. Information from these experiences arrives directly from the entorhinal cortex as well as indirectly through the dentate gyrus, which performs sparsification and decorrelation. The computational purpose for these dual input pathways has not been firmly established. We model CA3 as a Hopfield-like network that stores both dense, correlated encodings and sparse, decorrelated encodings. As more memories are stored, the former merge along shared features while the latter remain distinct. We verify our model's prediction in rat CA3 place cells, which exhibit more distinct tuning during theta phases with sparser activity. Finally, we find that neural networks trained in multitask learning benefit from a loss term that promotes both correlated and decorrelated representations. Thus, the complementary encodings we have found in CA3 can provide broad computational advantages for solving complex tasks.
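As background for the auto-associative storage mentioned above, the following is a minimal sketch of a classical Hopfield network with Hebbian weights and asynchronous sign updates. It is the textbook model, not the authors' CA3 model with dense and sparse encodings; the function names and the two toy patterns are assumptions chosen for illustration.

```python
def train_hopfield(patterns):
    """Build Hebbian weights from +1/-1 patterns, with a zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w


def recall(w, state, sweeps=5):
    """Update units one at a time, in fixed order, for a set number of sweeps."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            h = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if h >= 0 else -1
    return s


# Two orthogonal toy patterns, invented for illustration only.
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
cue = list(stored[0])
cue[0] = -cue[0]  # corrupt one unit of the first pattern
recovered = recall(train_hopfield(stored), cue)
```

With orthogonal patterns such as these, the corrupted cue settles back onto the first stored pattern. Correlated patterns interfere with one another in this classical setting, which is the regime the abstract describes as memories merging along shared features.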
Project description:Developing effective treatment strategies for neurodegenerative diseases requires an understanding of the underlying cellular pathways that lead to neuronal vulnerability and progressive degeneration. To date, numerous mutations in 147 distinct genes have been identified as "associated" with, "modifier" of, or "causative" of amyotrophic lateral sclerosis (ALS). Protein products of these genes and their interactions helped determine the protein landscape of ALS, and revealed upstream modulators, key canonical pathways, interactome domains and novel therapeutic targets. Our analysis originates from known human mutations and circles back to human, revealing increased PPARG and PPARGC1A expression in the Betz cells of sALS patients and patients with TDP43 pathology, and emphasizes the importance of lipid homeostasis. Downregulation of YWHAZ, a 14-3-3 protein, and cytoplasmic accumulation of ZFYVE27, especially in diseased Betz cells of ALS patients, reinforce the idea that perturbed protein communications, interactome defects, and altered converging pathways will reveal novel therapeutic targets in ALS.
Project description:Genome-wide association studies (GWAS) of schizophrenia have revealed the role of rare and common genetic variants, but the functional effects of the risk variants remain to be understood. Protein interactome-based studies can facilitate the study of molecular mechanisms by which the risk genes relate to schizophrenia (SZ) genesis, but protein-protein interactions (PPIs) are unknown for many of the liability genes. We developed a computational model to discover PPIs, which is found to be highly accurate according to computational evaluations and experimental validations of selected PPIs. We present here 365 novel PPIs of liability genes identified by the SZ Working Group of the Psychiatric Genomics Consortium (PGC). Seventeen genes that had no previously known interactions have 57 novel interactions by our method. Among the new interactors are 19 drug targets that are targeted by 130 drugs. In addition, we computed 147 novel PPIs of 25 candidate genes investigated in the pre-GWAS era. While there is little overlap between the GWAS genes and the pre-GWAS genes, the interactomes reveal that they largely belong to the same pathways, thus reconciling the apparent disparities between the GWAS and prior gene association studies. The interactome, including 504 novel PPIs overall, could motivate other systems biology studies and trials with repurposed drugs. The PPIs are made available on a webserver called Schizo-Pi, at http://severus.dbmi.pitt.edu/schizo-pi, with advanced search capabilities.
Project description:While machine learning (ML) models have been able to achieve unprecedented accuracies across various prediction tasks in quantum chemistry, it is now apparent that accuracy on a test set alone is not a guarantee for robust chemical modeling such as stable molecular dynamics (MD). To go beyond accuracy, we use explainable artificial intelligence (XAI) techniques to develop a general analysis framework for atomic interactions and apply it to the SchNet and PaiNN neural network models. We compare these interactions with a set of fundamental chemical principles to understand how well the models have learned the underlying physicochemical concepts from the data. We focus on the strength of the interactions for different atomic species, how predictions for intensive and extensive quantum molecular properties are made, and analyze the decay and many-body nature of the interactions with interatomic distance. Models that deviate too far from known physical principles produce unstable MD trajectories, even when they have very high energy and force prediction accuracy. We also suggest further improvements to the ML architectures to better account for the polynomial decay of atomic interactions.
Project description:Understanding the molecular mechanisms associated with disease is a central goal of modern medical research. As such, many thousands of experiments have been published that detail individual molecular events that contribute to a disease. Here we use a semi-automated text mining approach to accurately and exhaustively curate the primary literature for chronic pain states. In so doing, we create a comprehensive network of 1,002 contextualized protein-protein interactions (PPIs) specifically associated with pain. The PPIs form a highly interconnected and coherent structure, and the resulting network provides an alternative to those derived from connecting genes associated with pain using interactions that have not been shown to occur in a painful state. We exploit the contextual data associated with our interactions to analyse subnetworks specific to inflammatory and neuropathic pain, and to various anatomical regions. Here, we identify potential targets for further study and several drug-repurposing opportunities. Finally, the network provides a framework for the interpretation of new data within the field of pain.
Project description:Background: Malignant peritoneal mesothelioma (MPeM) is an aggressive cancer affecting the abdominal peritoneal lining and intra-abdominal organs, with a median survival of ~2.5 years. Methods: We constructed the protein interactome of 59 MPeM-associated genes with previously known protein-protein interactions (PPIs) as well as novel PPIs predicted using our previously developed HiPPIP computational model and analysed it for transcriptomic and functional associations and for repurposable drugs. Results: The MPeM interactome had over 400 computationally predicted PPIs and 4700 known PPIs. Transcriptomic evidence validated 75.6% of the genes in the interactome and 65% of the novel interactors. Some genes had tissue-specific expression in extramedullary hematopoietic sites and the expression of some genes could be correlated with unfavourable prognoses in various cancers. 39 out of 152 drugs that target the proteins in the interactome were identified as potentially repurposable for MPeM, with 29 having evidence from prior clinical trials, animal models or cell lines for effectiveness against peritoneal and pleural mesothelioma and primary peritoneal cancer. Functional modules related to chromosomal segregation, transcriptional dysregulation, IL-6 production and hematopoiesis were identified from the interactome. The MPeM interactome overlapped significantly with the malignant pleural mesothelioma interactome, revealing shared molecular pathways. Conclusions: Our findings demonstrate the utility of the interactome in uncovering biological associations and in generating clinically translatable results.
Project description:Malignant pleural mesothelioma (MPM) is an aggressive cancer affecting the outer lining of the lung, with a median survival of less than one year. We constructed an 'MPM interactome' with over 300 computationally predicted protein-protein interactions (PPIs) and over 2400 known PPIs of 62 literature-curated genes whose activity affects MPM. Known PPIs of the 62 MPM associated genes were derived from Biological General Repository for Interaction Datasets (BioGRID) and Human Protein Reference Database (HPRD). Novel PPIs were predicted by applying the HiPPIP algorithm, which computes features of protein pairs such as cellular localization, molecular function, biological process membership, genomic location of the gene, and gene expression in microarray experiments, and classifies the pairwise features as interacting or non-interacting based on a random forest model. We validated five novel predicted PPIs experimentally. The interactome is significantly enriched with genes differentially expressed in MPM tumors compared with normal pleura and with other thoracic tumors, genes whose high expression has been correlated with unfavorable prognosis in lung cancer, genes differentially expressed on crocidolite exposure, and exosome-derived proteins identified from malignant mesothelioma cell lines. 28 of the interactors of MPM proteins are targets of 147 U.S. Food and Drug Administration (FDA)-approved drugs. By comparing disease-associated versus drug-induced differential expression profiles, we identified five potentially repurposable drugs, namely cabazitaxel, primaquine, pyrimethamine, trimethoprim and gliclazide. Preclinical studies may be conducted in vitro to validate these computational results. Interactome analysis of disease-associated genes is a powerful approach with high translational impact. It shows how MPM-associated genes identified by various high throughput studies are functionally linked, leading to clinically translatable results such as repurposed drugs. The PPIs are made available on a webserver with interactive user interface, visualization and advanced search capabilities.
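The comparison of disease-associated and drug-induced differential expression profiles mentioned above can be illustrated with a small sketch. It assumes one common heuristic, namely that a candidate drug is of interest when it shifts genes in the opposite direction to the disease. The abstract does not state the scoring rule the authors used, so the function `reversal_score` and the toy log fold-change values are hypothetical.

```python
def reversal_score(disease, drug):
    """Fraction of shared, changed genes that move in opposite directions.

    disease, drug: dicts mapping gene name to a signed log fold change.
    Returns 0.0 when no gene is changed in both profiles.
    """
    shared = [g for g in disease
              if g in drug and disease[g] != 0 and drug[g] != 0]
    if not shared:
        return 0.0
    # A negative product means the two changes have opposite signs.
    opposite = sum(1 for g in shared if disease[g] * drug[g] < 0)
    return opposite / len(shared)


# Toy profiles, invented for illustration only.
disease_profile = {"G1": 2.0, "G2": -1.5, "G3": 0.8, "G4": 1.1}
drug_profile = {"G1": -1.2, "G2": 0.9, "G3": 0.4, "G5": -2.0}
score = reversal_score(disease_profile, drug_profile)
```

Here two of the three shared genes are reversed by the hypothetical drug. A real analysis would also weigh effect sizes and statistical significance rather than direction alone.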
Project description:Protein microarrays enable investigation of diverse biochemical properties for thousands of proteins in a single experiment, an unparalleled capacity. Using a high-density system called HaloTag nucleic acid programmable protein array (HaloTag-NAPPA), we created high-density protein arrays comprising 12,000 Arabidopsis ORFs. We used these arrays to query protein-protein interactions for a set of 38 transcription factors and transcriptional regulators (TFs) that function in diverse plant hormone regulatory pathways. The resulting transcription factor interactome network, TF-NAPPA, contains thousands of novel interactions. Validation in a benchmarked in vitro pull-down assay revealed that a random subset of TF-NAPPA validated at the same rate of 64% as a positive reference set of literature-curated interactions. Moreover, using a bimolecular fluorescence complementation (BiFC) assay, we confirmed in planta several interactions of biological interest and determined the interaction localizations for seven pairs. The application of HaloTag-NAPPA technology to plant hormone signaling pathways allowed the identification of many novel transcription factor-protein interactions and led to the development of a proteome-wide plant hormone TF interactome network.
Project description:Background: The analysis and usage of biological data are hindered by the spread of information across multiple repositories and the difficulties posed by different nomenclature systems and storage formats. In particular, there is an important need for data unification in the study and use of protein-protein interactions. Without good integration strategies, it is difficult to analyze the whole set of available data and its properties. Results: We introduce BIANA (Biologic Interactions and Network Analysis), a tool for biological information integration and network management. BIANA is a Python framework designed to achieve two major goals: i) the integration of multiple sources of biological information, including biological entities and their relationships, and ii) the management of biological information as a network where entities are nodes and relationships are edges. Moreover, BIANA uses properties of proteins and genes to infer latent biomolecular relationships by transferring edges to entities sharing similar properties. BIANA is also provided as a plugin for Cytoscape, which allows users to visualize and interactively manage the data. A web interface to BIANA providing basic functionalities is also available. The software can be downloaded under GNU GPL license from http://sbi.imim.es/web/BIANA.php. Conclusions: BIANA's approach to data unification solves many of the nomenclature issues common to systems dealing with biological data. BIANA can easily be extended to handle new specific data repositories and new specific data types. The unification protocol allows BIANA to be a flexible tool suitable for different user requirements: non-expert users can use a suggested unification protocol while expert users can define their own specific unification rules.
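The idea of transferring edges to entities that share similar properties can be sketched in plain Python. The snippet below does not use BIANA's API; the function `transfer_edges`, the property labels, and the rule that sharing a single property value counts as similarity are all assumptions made for illustration.

```python
def transfer_edges(edges, properties):
    """Infer new edges by copying each known edge to similar entities.

    edges:      list of (entity_a, entity_b) pairs.
    properties: dict mapping an entity to a set of property values.
    Two entities count as similar if they share at least one property value.
    """
    def similar(x):
        own = properties.get(x, set())
        return {y for y, props in properties.items() if y != x and props & own}

    known = {frozenset(e) for e in edges}
    inferred = set()
    for a, b in edges:
        for a2 in similar(a):
            if a2 != b:
                inferred.add(frozenset((a2, b)))
        for b2 in similar(b):
            if b2 != a:
                inferred.add(frozenset((a, b2)))
    # Report only relationships that were not already known.
    return inferred - known


# Toy network, invented for illustration only.
toy_properties = {"P1": {"famX"}, "P2": {"famY"}, "P3": {"famX"}, "P4": {"famZ"}}
toy_edges = [("P1", "P2")]
latent = transfer_edges(toy_edges, toy_properties)
```

In this toy network `P3` shares a property with `P1`, so the known `P1`-`P2` relationship is transferred to give a latent `P3`-`P2` edge, while `P4` shares nothing and gains no edges.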