A conserved mammalian protein interaction network.
ABSTRACT: Physical interactions between proteins mediate a variety of biological functions, including signal transduction, physical structuring of the cell and regulation. While extensive catalogs of such interactions are known from model organisms, their evolutionary histories are difficult to study given the lack of interaction data from phylogenetic outgroups. Using phylogenomic approaches, we infer an upper bound on the time of origin for a large set of human protein-protein interactions, showing that most such interactions appear relatively ancient, dating no later than the radiation of placental mammals. By analyzing paired alignments of orthologous and putatively interacting protein-coding genes from eight mammals, we find evidence for weak but significant co-evolution, as measured by relative selective constraint, between pairs of genes with interacting proteins. However, we find no strong evidence for shared instances of directional selection within an interacting pair. Finally, we use a network approach to show that the distribution of selective constraint across the protein interaction network is non-random, with a clear tendency for interacting proteins to share similar selective constraints. Collectively, the results suggest that, on the whole, protein interactions in mammals are under selective constraint, presumably due to their functional roles.
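One way to picture the claim that interacting proteins share similar selective constraints is a label-permutation test on the network. The sketch below is illustrative only and is not taken from the abstract: the test statistic (mean absolute difference in constraint across edges), the function names, the toy network, and the constraint values are all assumptions.

```python
import random

def mean_edge_difference(edges, constraint):
    """Mean absolute difference in constraint between the two ends of each edge."""
    return sum(abs(constraint[a] - constraint[b]) for a, b in edges) / len(edges)

def permutation_p_value(edges, constraint, n_perm=2000, seed=0):
    """Shuffle constraint values across proteins and count how often the
    shuffled network looks at least as 'similar' as the observed one."""
    rng = random.Random(seed)
    observed = mean_edge_difference(edges, constraint)
    proteins = list(constraint)
    values = [constraint[p] for p in proteins]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        shuffled = dict(zip(proteins, values))
        if mean_edge_difference(edges, shuffled) <= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical constraint values (for example, dN/dS-like ratios) and edges.
constraint = {"A": 0.05, "B": 0.06, "C": 0.07, "D": 0.40, "E": 0.42, "F": 0.45}
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "E"), ("E", "F"), ("D", "F")]

observed, p_value = permutation_p_value(edges, constraint)
print(observed, p_value)
```

A small observed statistic relative to the shuffled networks would indicate that neighbors in the network tend to have similar constraint; a real analysis would need far more proteins than this toy example.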
Project description:Viruses interact with hundreds to thousands of proteins in mammals, yet adaptation against viruses has only been studied in a few proteins specialized in antiviral defense. Whether adaptation to viruses typically involves only specialized antiviral proteins or affects a broad array of virus-interacting proteins is unknown. Here, we analyze adaptation in ~1300 virus-interacting proteins manually curated from a set of 9900 proteins conserved in all sequenced mammalian genomes. We show that viruses (i) use the more evolutionarily constrained proteins within the cellular functions they interact with and that (ii) despite this high constraint, virus-interacting proteins account for a high proportion of all protein adaptation in humans and other mammals. Adaptation is elevated in virus-interacting proteins across all functional categories, including both immune and non-immune functions. We conservatively estimate that viruses have driven close to 30% of all adaptive amino acid changes in the part of the human proteome conserved within mammals. Our results suggest that viruses are one of the most dominant drivers of evolutionary change across mammalian and human proteomes.
Project description:Interactions between sperm and egg proteins can occur physically between gamete surface-binding proteins, and genetically between gamete proteins that work in complementary pathways in which they may not physically interact. Physically interacting sperm-egg proteins have been functionally identified in only a few species, and none have been verified within mammals. Candidate genes on both the sperm and egg surfaces exist, but gene deletion studies do not support functional interactions between these sperm-egg proteins; interacting sperm-egg proteins thus remain elusive. Cooperative gamete proteins undergo rapid evolution, and it is predicted that these sperm-egg proteins will also have correlated evolutionary rates due to compensatory changes on both the sperm and egg. To explore potential physical and genetic interactions in sperm-egg proteins, we sequenced four candidate genes from diverse primate species, and used regression and likelihood methods to test for signatures of coevolution between sperm-egg gene pairs. With both methods, we found that the egg protein CD9 coevolves with the sperm protein IZUMO1, suggesting a physical or genetic interaction occurs between them. With regression analysis, we found that CD9 and CRISP2 have correlated rates of evolution, and with likelihood analysis, that CD9 and CRISP1 have correlated rates. This suggests that the different tests may reflect different levels of interaction, be it physical or genetic. Coevolution tests thus provide an exploratory method for detecting potentially interacting sperm-egg protein pairs.
Project description:In this study, we investigated on a systems level how complex protein interactions underlying cell polarity in yeast determine the dynamic association of proteins with the polar cortical domain (PCD) where they localize and perform morphogenetic functions. We constructed a network of physical interactions among >100 proteins localized to the PCD. This network was further divided into five robust modules correlating with distinct subprocesses associated with cell polarity. Based on this reconstructed network, we proposed a simple model that approximates a PCD protein's molecular residence time as the sum of the characteristic time constants of the functional modules with which it interacts, weighted by the number of edges forming these interactions. Regression analyses showed excellent fitting of the model with experimentally measured residence times for a large subset of the PCD proteins. The model is able to predict residence times using small training sets. Our analysis also revealed a scaffold protein that imposes a local constraint of dynamics for certain interacting proteins.
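The residence-time model described above can be sketched as a linear model: a protein's residence time is the sum, over modules, of the number of edges it has into that module times the module's time constant. The abstract does not give the exact formula, normalization, or fitting procedure, so the unnormalized weighting, the ordinary least-squares fit, the function names, and the toy numbers below are all assumptions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_time_constants(edge_counts, residence_times):
    """Least-squares estimate of one time constant per module (normal equations)."""
    k = len(edge_counts[0])
    ata = [[sum(row[i] * row[j] for row in edge_counts) for j in range(k)]
           for i in range(k)]
    aty = [sum(row[i] * t for row, t in zip(edge_counts, residence_times))
           for i in range(k)]
    return solve_linear(ata, aty)

def predict_residence_time(counts, tau):
    """Residence time as the edge-weighted sum of module time constants."""
    return sum(n * t for n, t in zip(counts, tau))

# Hypothetical training set: 4 proteins, 2 modules; each row holds the number
# of edges from that protein into each module.
edge_counts = [[1, 0], [0, 2], [1, 1], [2, 1]]
residence_times = [3.0, 1.0, 3.5, 6.5]  # made-up measurements, arbitrary units

tau = fit_time_constants(edge_counts, residence_times)
print(tau)
print(predict_residence_time([3, 2], tau))
```

Because the model has only one parameter per module, a handful of measured proteins is enough to determine it, which is consistent with the abstract's remark that small training sets suffice.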
Project description:The capacity of proteins to interact specifically with one another underlies our conceptual understanding of how living systems function. Systems-level study of specificity in protein-protein interactions is complicated by the fact that the cellular environment is crowded and heterogeneous; interaction pairs may exist at low relative concentrations and thus be presented with many more opportunities for promiscuous interactions compared with specific interaction possibilities. Here we address these questions by using a simple computational model that includes specifically designed interacting model proteins immersed in a mixture containing hundreds of different unrelated ones; all of them undergo simulated diffusion and interaction. We find that specific complexes are quite robust to interference from promiscuous interaction partners only in the range of temperatures T(design) > T > T(rand). At T > T(design), specific complexes become unstable, whereas at T < T(rand), formation of specific complexes is suppressed by promiscuous interactions. Specific interactions can form only if T(design) > T(rand). This condition requires an energy gap between binding energy in a specific complex and set of binding energies between randomly associating proteins, providing a general physical constraint on evolutionary selection or design of specific interacting protein interfaces. This work has implications for our understanding of how the protein repertoire functions and evolves within the context of cellular systems.
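The temperature regimes stated above can be written as a small classifier. This only encodes the inequalities from the abstract; the function name, the regime labels, and the handling of the exact boundary temperatures (which the abstract does not address) are assumptions.

```python
def specificity_regime(t, t_design, t_rand):
    """Classify temperature t relative to T(design) and T(rand)."""
    if t_design <= t_rand:
        # No energy gap: specific complexes cannot form at any temperature.
        return "no window"
    if t > t_design:
        # Specific complexes become unstable.
        return "unstable"
    if t < t_rand:
        # Promiscuous interactions suppress specific complex formation.
        return "promiscuous"
    if t_rand < t < t_design:
        # Specific complexes are robust to promiscuous partners.
        return "specific"
    # t equals one of the two limits; the abstract does not cover this case.
    return "boundary"

# Temperatures in arbitrary units (hypothetical values).
for t in (0.1, 1.0, 3.0):
    print(t, specificity_regime(t, t_design=2.0, t_rand=0.5))
```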
Project description:Protein-protein interactions are ubiquitous and essential for most biological processes. Although new proteomic technologies have generated large catalogs of interacting proteins, considerably less is known about these interactions at the molecular level, information that would aid in predicting protein interactions, designing therapeutics to alter these interactions, and understanding the effects of disease-producing mutations. Here we describe mapping the interacting surfaces of the bacterial toxin SPN (Streptococcus pyogenes NAD(+) hydrolase) in complex with its antitoxin IFS (immunity factor for SPN) by using hydrogen-deuterium amide exchange and electrospray ionization mass spectrometry. This approach affords data in a relatively short time for small amounts of protein, typically 5-7 pmol per analysis. The results show a good correspondence with a recently determined crystal structure of the IFS-SPN complex but additionally provide strong evidence for a folding transition of the IFS protein that accompanies its binding to SPN. The outcome shows that mass-based chemical footprinting of protein interaction surfaces can provide information about protein dynamics that is not easily obtained by other methods and can potentially be applied to large, multiprotein complexes that are out of range for most solution-based methods of biophysical analysis.
Project description:Experimental high-throughput studies of protein-protein interactions are beginning to provide enough data for comprehensive computational studies. Today, about ten large data sets, each with thousands of interacting pairs, coarsely sample the interactions in fly, human, worm, and yeast. About 55,000 additional pairs of interacting proteins have been identified by more careful, detailed biochemical experiments. Most interactions are experimentally observed in prokaryotes and simple eukaryotes; very few interactions are observed in higher eukaryotes such as mammals. It is commonly assumed that pathways in mammals can be inferred through homology to model organisms, e.g. the experimental observation that two yeast proteins interact is transferred to infer that the two corresponding proteins in human also interact. Two pairs for which the interaction is conserved are often described as interologs. The goal of this investigation was a large-scale comprehensive analysis of such inferences, i.e. of the evolutionary conservation of interologs. Here, we introduced a novel score for measuring the overlap between protein-protein interaction data sets. This measure appeared to reflect the overall quality of the data and was the basis for the two surprising results of our large-scale analysis. Firstly, homology-based inferences of physical protein-protein interactions appeared far less successful than expected. In fact, such inferences were accurate only for extremely high levels of sequence similarity. Secondly, and most surprisingly, the identification of interacting partners through sequence similarity was significantly more reliable for protein pairs within the same organism than for pairs between species. Our analysis underlined that the discrepancies between different data sets are large, even when using the same type of experiment on the same organism. This reality considerably constrains the power of homology-based transfer of interactions.
In particular, the experimental probing of interactions in distant model organisms has to be undertaken with some caution. More comprehensive images of protein-protein networks will require the combination of many high-throughput methods, including in silico inferences and predictions. http://www.rostlab.org/results/2006/ppi_homology/
Project description:Levels of selective constraint vary among proteins. Although strong constraint on a protein is often attributed to its functional importance, evolutionary rate may also be limited if a protein is fragile, such that a large proportion of amino acid replacements reduce its fitness. To determine the relative contributions of essentiality and fragility to selective constraint, we compared relationships of selection against nonsense mutations (snon) and selection against missense mutations (smis) to protein sequence conservation (Ka). As expected, snon is greater than smis; however, the correlation between smis and Ka is nearly three times stronger than the correlation between snon and Ka. Moreover, examination of relationships to gene expression level, tissue specificity, and number of protein-protein interactions shows that smis is more strongly correlated than snon to all three measures of biological function. Thus, our analysis reveals that slowly evolving proteins are under strong selective constraint primarily because they are fragile, and that this association likely exists because allowing a protein to function improperly, rather than removing it from a biological network, can negatively affect the functions of other molecules it interacts with and their downstream products.
Project description:Deep catalogs of genetic variation from thousands of humans enable the detection of intraspecies constraint by identifying coding regions with a scarcity of variation. While existing techniques summarize constraint for entire genes, single gene-wide metrics conceal regional constraint variability within each gene. Therefore, we have created a detailed map of constrained coding regions (CCRs) by leveraging variation observed among 123,136 humans from the Genome Aggregation Database. The most constrained CCRs are enriched for pathogenic variants in ClinVar and mutations underlying developmental disorders. CCRs highlight protein domain families under high constraint and suggest unannotated or incomplete protein domains. The highest-percentile CCRs complement existing variant prioritization methods when evaluating de novo mutations in studies of autosomal dominant disease. Finally, we identify highly constrained CCRs within genes lacking known disease associations. This observation suggests that CCRs may identify regions under strong purifying selection that, when mutated, cause severe developmental phenotypes or embryonic lethality.
Project description:Genes carry out their biological functions through pathways in complex networks consisting of many interacting molecules. Studying the effect of network architecture on the evolution of individual proteins provides valuable information for understanding the origin, evolution, and functional conservation of signaling pathways. However, the relationship between network architecture and the sequence evolution of individual proteins remains poorly understood. In the current study, we carried out a network-level molecular evolution analysis of the TLR (Toll-like receptor) signaling pathway, which plays an important role in innate immunity in insects and mammals, and we found that: 1) the selective constraint on genes was negatively correlated with their position along the TLR signaling pathway; 2) all genes in the TLR signaling pathway were highly conserved and underwent strong purifying selection; 3) the distribution of selective pressure along the pathway was driven by differential nonsynonymous substitution levels; 4) the TLR signaling pathway might have been present in a common ancestor of sponges and eumetazoa and evolved through duplication of the TLR, IKK, IκB and NF-κB genes as well as expansion of adaptor molecules, with the gene structure and conserved motifs of the NF-κB genes shifting over their evolutionary history. Our results improve our understanding of the evolutionary history of the animal TLR signaling pathway and of the relationship between network architecture and the sequence evolution of individual proteins.
Project description:Interactions of transcriptional activators are difficult to study using transcription-based two-hybrid assays due to potent activation resulting in false positives. Here we report the development of the Golgi two-hybrid (G2H), a method that interrogates protein interactions within the Golgi, where transcriptional activators can be assayed with negligible background. The G2H relies on cell surface glycosylation to report extracellularly on protein-protein interactions occurring within the secretory pathway. In the G2H, protein pairs are fused to modular domains of the reporter glycosyltransferase, Och1p, and proper cell wall formation due to Och1p activity is observed only when a pair of proteins interacts. Cells containing interacting protein pairs are identified by selectable phenotypes associated with Och1p activity and proper cell wall formation: cells that have interacting proteins grow under selective conditions and display weak wheat germ agglutinin (WGA) binding by flow cytometry, whereas cells that lack interacting proteins display stunted growth and strong WGA binding. Using this assay, we detected the interaction between transcription factor MyoD and its binding partner Id2. Interfering mutations along the MyoD:Id2 interaction interface ablated signal in the G2H assay. Furthermore, we used the G2H to detect interactions of the activation domain of Gal4p with a variety of binding partners. Finally, selective conditions were used to enrich for cells encoding interacting partners. The G2H detects protein-protein interactions that cannot be identified via traditional two-hybrid methods and should be broadly useful for probing previously inaccessible subsets of the interactome, including transcriptional activators and proteins that traffic through the secretory pathway.