Project description:Cyclic nucleotide-binding (CNB) domains allosterically regulate the activity of proteins with diverse functions, but the mechanisms that enable the cyclic nucleotide-binding signal to regulate distant domains are not well understood. Here we use optical tweezers and molecular dynamics to dissect changes in the folding energy landscape associated with cAMP-binding signals transduced between the two CNB domains of protein kinase A (PKA). We find that the response of the energy landscape upon cAMP binding is domain specific, resulting in unique but mutually coordinated tasks: one CNB domain initiates cAMP binding and cooperativity, whereas the other triggers inter-domain interactions that promote the active conformation. Inter-domain interactions occur in a stepwise manner, beginning in intermediate-liganded states between apo and cAMP-bound domains. Moreover, we identify a cAMP-responsive switch, the N3A motif, whose conformation and stability depend on cAMP occupancy. This switch serves as a signaling hub, amplifying cAMP-binding signals during PKA activation.
Project description:Structural and biochemical constraints force some segments of proteins to evolve more slowly than others, often allowing identification of conserved structural or sequence motifs that can be associated with substrate binding properties, chemical mechanisms, and molecular functions. We have assessed the functional and structural constraints imposed by cofactors on the evolution of new functions in a superfamily of flavoproteins characterized by two-dinucleotide binding domains, the "two dinucleotide binding domains" flavoproteins (tDBDF) superfamily. Although these enzymes catalyze many different types of oxidation/reduction reactions, each is initiated by a stereospecific hydride transfer reaction between two cofactors, a pyridine nucleotide and flavin adenine dinucleotide (FAD). Sequence and structural analysis of more than 1,600 members of the superfamily reveals new members and identifies details of the evolutionary connections among them. Our analysis shows that in all of the highly divergent families within the superfamily, these cofactors adopt a conserved configuration optimal for stereospecific hydride transfer that is stabilized by specific interactions with amino acids from several motifs distributed among both dinucleotide binding domains. The conservation of cofactor configuration in the active site restricts the pyridine nucleotide to interact with FAD from the re-side, limiting the flow of electrons from the re-side to the si-side. This directionality of electron flow constrains interactions with the different partner proteins of different families to occur on the same face of the cofactor binding domains. As a result, superimposing the structures of tDBDFs aligns not only these interacting proteins, but also their constituent electron acceptors, including heme and iron-sulfur clusters. 
Thus, not only are specific aspects of the cofactor-directed chemical mechanism conserved across the superfamily, the constraints they impose are manifested in the mode of protein-protein interactions. Overlaid on this foundation of conserved interactions, nature has conscripted different protein partners to serve as electron acceptors, thereby generating diversification of function across the superfamily.
Project description:Motivation: Protein synthesis is a non-equilibrium process, meaning that the speed of translation can influence the ability of proteins to fold and function. Assuming that structurally similar proteins fold by similar pathways, the profile of translation speed along an mRNA should be evolutionarily conserved between related proteins to direct correct folding and downstream function. The only evidence to date for such conservation of translation speed between homologous proteins has used codon rarity as a proxy for translation speed. There are, however, many other factors including mRNA structure and the chemistry of the amino acids in the A- and P-sites of the ribosome that influence the speed of amino acid addition. Results: Ribosome profiling experiments provide a signal directly proportional to the underlying translation times at the level of individual codons. We compared ribosome occupancy profiles (extracted from five different large-scale yeast ribosome profiling studies) between related protein domains to more directly test if their translation schedule was conserved. Our analysis reveals that the ribosome occupancy profiles of paralogous domains tend to be significantly more similar to one another than to profiles of non-paralogous domains. This trend does not depend on domain length, structural classes, amino acid composition or sequence similarity. Our results indicate that entire ribosome occupancy profiles and not just rare codon locations are conserved between even distantly related domains in yeast, providing support for the hypothesis that translation schedule is conserved between structurally related domains to retain folding pathways and facilitate efficient folding. Availability and implementation: Python3 code is available on GitHub at https://github.com/DanNissley/Compare-ribosome-occupancy. Supplementary information: Supplementary data are available at Bioinformatics online.
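As a rough illustration of what comparing two ribosome occupancy profiles can look like, the sketch below scores the similarity of two per-codon read-count profiles. It is not the authors' method (their code is in the GitHub repository named above). It assumes, purely for illustration, that the two profiles have already been aligned to equal length, that each is normalized by its mean read count, and that similarity is measured with a Pearson correlation. The function names `normalize`, `pearson`, and `profile_similarity` are hypothetical.

```python
from math import sqrt


def normalize(profile):
    """Scale a per-codon read-count profile so that its mean is 1.

    Assumption: mean normalization is one simple way to remove
    differences in overall expression level between two genes.
    """
    mean = sum(profile) / len(profile)
    if mean == 0:
        raise ValueError("profile contains no reads")
    return [x / mean for x in profile]


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(a) != len(b):
        raise ValueError("profiles must be aligned to the same length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_a * var_b)


def profile_similarity(profile_a, profile_b):
    """Similarity of two aligned occupancy profiles (hypothetical metric)."""
    return pearson(normalize(profile_a), normalize(profile_b))


if __name__ == "__main__":
    # Made-up read counts for two aligned domains, for demonstration only.
    domain_1 = [12, 30, 8, 45, 10, 9]
    domain_2 = [6, 14, 5, 25, 4, 6]
    print(round(profile_similarity(domain_1, domain_2), 3))
```

To test whether related domains are more similar than unrelated ones, scores like this would be collected for paralogous and non-paralogous pairs and the two distributions compared; the choice of statistical test is left open here.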
Project description:Disordered regions of proteins often bind to structured domains, mediating interactions within and between proteins. However, it is difficult to identify a priori the short disordered regions involved in binding. We set out to determine if docking such peptide regions to peptide-binding domains would assist in these predictions. We assembled a redundancy-reduced dataset of proteins containing SLiMs (Short Linear Motifs) from the ELM database. We selected 84 sequences that had an associated PDB structure showing the SLiM bound to a protein receptor, where the SLiM was found within a 50-residue region of the protein sequence that was predicted to be disordered. First, we investigated the Vina docking scores of overlapping tripeptides from the 50-residue SLiM-containing disordered regions of the protein sequence to the corresponding PDB domain. We found only weak discrimination of docking scores between peptides involved in binding and adjacent non-binding peptides in this context (AUC 0.58). Next, we trained a bidirectional recurrent neural network (BRNN) using as input the protein sequence, predicted secondary structure, Vina docking score and predicted disorder score. The results were very promising (AUC 0.72), showing that multiple sources of information can be combined to produce results that are clearly superior to any single source. We conclude that the Vina docking score alone has only modest power to define the location of a peptide within a larger protein region known to contain it. However, combining this information with other knowledge (using machine learning methods) clearly improves the identification of peptide-binding regions within a protein sequence. This approach combining docking with machine learning is primarily a predictor of binding to peptide-binding sites, and is not intended as a predictor of specificity of binding to particular receptors.
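The two computational steps named in this abstract, splitting a region into overlapping tripeptides and summarizing discrimination with an AUC, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names `overlapping_peptides` and `auc` are hypothetical, the example scores are invented, and the docking itself is not performed. AutoDock Vina reports binding affinities in kcal/mol where more negative means stronger predicted binding, so the scores are negated before ranking.

```python
def overlapping_peptides(sequence, k=3):
    """Return all overlapping windows of length k (k=3 gives tripeptides)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney formulation).

    Equals the probability that a randomly chosen positive example
    has a higher score than a randomly chosen negative one; ties
    count as one half.
    """
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    region = "MKTPPLSE"  # made-up sequence, for demonstration only
    peptides = overlapping_peptides(region)
    # Invented Vina-style affinities (kcal/mol), one per tripeptide.
    vina_scores = [-3.1, -3.4, -4.9, -5.2, -4.0, -3.0]
    # Hypothetical labels: 1 if the tripeptide overlaps the binding motif.
    in_motif = [0, 0, 1, 1, 1, 0]
    # Negate so that a higher value means stronger predicted binding.
    print(peptides)
    print(auc([-s for s in vina_scores], in_motif))
```

An AUC of 0.5 corresponds to no discrimination, which is why the reported 0.58 for docking scores alone is described as weak.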
Project description:High-resolution crystal structures of the DNA duplex sequence d(CGCGAATTCGCG)₂ complexed with three minor-groove ligands are reported. A highly conserved cluster of 11 linked water molecules has been found in the native and all three ligand-bound structures, positioned at the boundary of the A/T and G/C regions where the minor groove widens. This cluster appears to play a key structural role in stabilizing noncovalently binding small molecules in the AT region of the B-DNA minor groove. The cluster extends from the backbone phosphate groups along the mouth of the groove and links to DNA and ligands by a network of hydrogen bonds that help to maintain the ligands in position. This arrangement of water molecules is distinct from, but linked by hydrogen bonding to, the well-established spine of hydration, which is displaced by bound ligands. Features of the water cluster and observed differences in binding modes help to explain the measured binding affinities and thermodynamic characteristics of these ligands on binding to AT sites in DNA.
Project description:Background: Large-scale bioactivity/SAR Open Data has recently become available, and this has allowed new analyses and approaches to be developed to help address the productivity and translational gaps of current drug discovery. One of the current limitations of these data is the relative sparsity of reported interactions per protein target, and complexities in establishing clear relationships between bioactivity and targets using bioinformatics tools. We detail in this paper the indexing of targets by the structural domains that bind (or are likely to bind) the ligand within a full-length protein. Specifically, we present a simple heuristic to map small molecule binding to Pfam domains. This profiling can be applied to all proteins within a genome to give some indications of the potential pharmacological modulation and regulation of all proteins. Results: In this implementation of our heuristic, ligand binding to protein targets from the ChEMBL database was mapped to structural domains as defined by profiles contained within the Pfam-A database. Our mapping suggests that the majority of assay targets within the current version of the ChEMBL database bind ligands through a small number of highly prevalent domains, and conversely the majority of Pfam domains sampled by our data play no currently established role in ligand binding. Validation studies, carried out firstly against Uniprot entries with expert binding-site annotation and secondly against entries in the wwPDB repository of crystallographic protein structures, demonstrate that our simple heuristic maps ligand binding to the correct domain in about 90 percent of all assessed cases. Using the mappings obtained with our heuristic, we have assembled ligand sets associated with each Pfam domain. Conclusions: Small molecule binding has been mapped to Pfam-A domains of protein targets in the ChEMBL bioactivity database.
The result of this mapping is an enriched annotation of small molecule bioactivity data and a grouping of activity classes following the Pfam-A specifications of protein domains. This is valuable for data-focused approaches in drug discovery, for example when extrapolating potential targets of a small molecule with known activity against one or few targets, or in the assessment of a potential target for drug discovery or screening studies.
Project description:Alphaviruses and flaviviruses have class II fusion glycoproteins that are essential for virion assembly and infectivity. Importantly, the tip of domain II is structurally conserved between the alphavirus and flavivirus fusion proteins, yet whether these structural similarities between virus families translate to functional similarities is unclear. Using in vivo evolution of Zika virus (ZIKV), we identified several novel emerging variants, including an envelope glycoprotein variant in β-strand c (V114M) of domain II. We have previously shown that the analogous β-strand c and the ij loop, located in the tip of domain II of the alphavirus E1 glycoprotein, are important for infectivity. This led us to hypothesize that flavivirus E β-strand c also contributes to flavivirus infection. We generated this ZIKV glycoprotein variant and found that while it had little impact on infection in mosquitoes, it reduced replication in human cells and mice and increased virus sensitivity to ammonium chloride, as seen for alphaviruses. In light of these results and given our alphavirus ij loop studies, we mutated a conserved alanine at the tip of the flavivirus ij loop to valine to test its effect on ZIKV infectivity. Interestingly, this mutation inhibited infectious virion production of ZIKV and yellow fever virus, but not West Nile virus. Together, these studies show that shared domains of the alphavirus and flavivirus class II fusion glycoproteins harbor structurally analogous residues that are functionally important and contribute to virus infection in vivo. IMPORTANCE Arboviruses are a significant global public health threat, yet there are no antivirals targeting these viruses. This problem is in part due to our lack of knowledge of the molecular mechanisms involved in the arbovirus life cycle. In particular, virus entry and assembly are essential processes in the virus life cycle and steps that can be targeted for the development of antiviral therapies. 
Therefore, understanding common, fundamental mechanisms used by different arboviruses for entry and assembly is essential. In this study, we show that flavivirus and alphavirus residues located in structurally conserved and analogous regions of the class II fusion proteins contribute to common mechanisms of entry, dissemination, and infectious-virion production. These studies highlight how class II fusion proteins function and provide novel targets for development of antivirals.
Project description:Lipopolysaccharide is a major glycolipid component in the outer leaflet of the outer membrane (OM), a peculiar permeability barrier of Gram-negative bacteria that prevents many toxic compounds from entering the cell. Lipopolysaccharide transport (Lpt) across the periplasmic space and its assembly at the Escherichia coli cell surface are carried out by a transenvelope complex of seven essential Lpt proteins spanning the inner membrane (LptBCFG), the periplasm (LptA), and the OM (LptDE), which appears to operate as a unique machinery. LptC is an essential inner membrane-anchored protein with a large periplasm-protruding domain. LptC binds the inner membrane LptBFG ABC transporter and interacts with the periplasmic protein LptA. However, its role in lipopolysaccharide transport is unclear. Here we show that LptC lacking the transmembrane region is viable and can bind the LptBFG inner membrane complex; thus, the essential LptC functions are located in the periplasmic domain. In addition, we characterize two previously described inactive single mutations at two conserved glycines (G56V and G153R, respectively) of the LptC periplasmic domain, showing that neither mutant is able to assemble the transenvelope machinery. However, while LptCG56V failed to copurify any Lpt component, LptCG153R was able to interact with the inner membrane protein complex LptBFG. Overall, our data further support the model whereby the bridge connecting the inner and outer membranes would be based on the conserved structurally homologous jellyroll domain shared by five out of the seven Lpt components.
Project description:Background: The Drosophila gene embryonic lethal abnormal visual system (elav) is the prototype of a gene family present in all metazoans. Its members encode structurally conserved neuronal proteins with three RNA Recognition Motifs (RRM) but they paradoxically act at diverse levels of post-transcriptional regulation. In an attempt to understand the history of this family, we searched for orthologs in eleven completely sequenced genomes, including those of humans, D. melanogaster and C. elegans, for which cDNAs are available. Results: We analyzed 23 orthologs/paralogs of elav, and found evidence of gain/loss of gene copy number. For one set of genes, including elav itself, the coding sequences are free of introns and their products most resemble ELAV. The remaining genes show remarkable conservation of their exon organization, and their products most resemble FNE and RBP9, proteins encoded by the two elav paralogs of Drosophila. Remarkably, three of the conserved exon junctions are both close to structural elements, involved respectively in protein-RNA interactions and in the regulation of sub-cellular localization, and in the vicinity of diverse sequence variations. Conclusion: The data indicate that the essential elav gene of Drosophila is newly emerged, restricted to dipterans and of retrotransposed origin. We propose that the conserved exon junctions constitute potential sites for sequence/function modifications, and that RRM binding proteins, whose function relies upon plastic RNA-protein interactions, may have played an important role in brain evolution.