SCOWLP update: 3D classification of protein-protein, -peptide, -saccharide and -nucleic acid interactions, and structure-based binding inferences across folds.
ABSTRACT: BACKGROUND: Protein interactions are essential for coordinating cellular functions. Proteomic studies have already elucidated a large number of protein-protein interactions that require detailed functional analysis. Understanding the structural basis of each individual interaction through structure determination is necessary, yet an unfeasible task. Therefore, computational tools able to predict protein binding regions and recognition modes are required to rationalize putative molecular functions for proteins. With this aim, we previously created SCOWLP, a structural classification of protein binding regions at protein family level, based on the information obtained from high-resolution 3D protein-protein and protein-peptide complexes. DESCRIPTION: We present here a new version of SCOWLP that has been enhanced by the inclusion of protein-nucleic acid and protein-saccharide interactions. SCOWLP takes interfacial solvent into account for a detailed characterization of protein interactions. In addition, the binding regions obtained per protein family have been enriched by the inclusion of predicted binding regions, which have been inferred from structurally related proteins across all existing folds. These inferences might become very useful to suggest novel recognition regions and compare structurally similar interfaces from different families. CONCLUSIONS: The updated SCOWLP has new functionalities that allow both detection and comparison of protein regions recognizing different types of ligands, which include other proteins, peptides, nucleic acids and saccharides, within a solvated environment. Currently, SCOWLP allows the analysis of predicted protein binding regions based on structure-based inferences across fold space. These predictions may have a unique potential in assisting protein docking, in providing insights into protein interaction networks, and in guiding rational engineering of protein ligands.
The newly designed SCOWLP web application has an improved user-friendly interface that facilitates its usage, and is available at http://www.scowlp.org.
Project description:The protein frustratometer is an energy landscape theory-inspired algorithm that aims at localizing and quantifying the energetic frustration present in protein molecules. Frustration is a useful concept for analyzing proteins' biological behavior. It compares the energy distributions of the native state with respect to structural decoys. The network of minimally frustrated interactions encompasses the folding core of the molecule. Sites of high local frustration often correlate with functional regions such as binding sites and regions involved in allosteric transitions. We present here an upgraded version of a webserver that measures local frustration. The new implementation allows the inclusion of electrostatic energy terms, which are important for interactions with nucleic acids, and is significantly faster than the previous version, enabling the analysis of large macromolecular complexes within a user-friendly interface. The webserver is freely available at http://frustratometer.qb.fcen.uba.ar.
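The comparison of a native energy against a decoy distribution can be illustrated with a minimal sketch. It scores one contact as a Z-score of its native energy relative to decoy energies. The function name `frustration_index`, the sign convention, and the sample energies are assumptions made for illustration; they are not taken from the webserver.

```python
from statistics import mean, pstdev

def frustration_index(native_energy, decoy_energies):
    """Z-score of the native energy relative to the decoy distribution.

    Assumed sign convention: a positive value means the native contact is
    lower in energy (more stabilizing) than the average decoy.
    """
    spread = pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies must not all be identical")
    return (mean(decoy_energies) - native_energy) / spread

# Hypothetical energies in arbitrary units.
decoys = [-1.0, 0.0, 1.0, 2.0, 3.0]
favorable = frustration_index(-2.0, decoys)    # native well below the decoy mean
unfavorable = frustration_index(4.0, decoys)   # native well above the decoy mean
```

Under this convention, a contact with a clearly positive index would count as minimally frustrated and one with a clearly negative index as highly frustrated. Any cutoffs would have to come from the method itself.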
Project description:In recent years, hundreds of novel RNA-binding proteins (RBPs) have been identified, leading to the discovery of novel RNA-binding domains. Furthermore, unstructured or disordered low-complexity regions of RBPs have been identified to play an important role in interactions with nucleic acids. However, these advances in understanding RBPs are limited mainly to eukaryotic species and we only have limited tools to faithfully predict RNA-binders in bacteria. Here, we describe a support vector machine-based method, called TriPepSVM, for the prediction of RNA-binding proteins. TriPepSVM applies string kernels to directly handle protein sequences using tri-peptide frequencies. Testing the method in human and bacteria, we find that several RBP-enriched tri-peptides occur more often in structurally disordered regions of RBPs. TriPepSVM outperforms existing applications, which consider classical structural features of RNA-binding or homology, in the task of RBP prediction in both human and bacteria. Finally, we predict 66 novel RBPs in Salmonella Typhimurium and validate the bacterial proteins ClpX, DnaJ and UbiG to associate with RNA in vivo.
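The idea of comparing sequences through tri-peptide frequencies can be sketched as below. The code counts overlapping 3-mers and takes the inner product of two count vectors, which is the standard spectrum kernel. Whether TriPepSVM uses exactly this kernel is an assumption, and the function names and example sequences are hypothetical.

```python
from collections import Counter

def tripeptide_counts(sequence):
    """Count overlapping tri-peptides (3-mers) in a protein sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))

def spectrum_kernel(seq_a, seq_b):
    """Inner product of the tri-peptide count vectors of two sequences."""
    counts_a = tripeptide_counts(seq_a)
    counts_b = tripeptide_counts(seq_b)
    # Counter returns 0 for tri-peptides absent from seq_b.
    return sum(n * counts_b[tri] for tri, n in counts_a.items())

# Hypothetical toy sequences, not real proteins.
example_counts = tripeptide_counts("RGGRGGS")
similarity = spectrum_kernel("RGGRGG", "GRGGA")
```

A kernel value of this kind could be passed to a support vector machine as a precomputed similarity, so the classifier works on sequences directly without a separate feature-extraction step.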
Project description:PixelDB, the Peptide Exosite Location Database, compiles 1966 non-redundant, high-resolution structures of protein-peptide complexes filtered to minimize the impact of crystal packing on peptide conformation. The database is organized to facilitate study of structurally conserved versus non-conserved elements of protein-peptide engagement. PixelDB clusters complexes based on the structural similarity of the peptide-binding protein, and by comparing complexes within a cluster highlights examples of domains that engage peptides using more than one binding mode. PixelDB also identifies conserved peptide core structural motifs characteristic of each binding mode. Peptide regions that flank core motifs often make non-structurally conserved interactions with the protein surface in regions we call exosites. Many examples establish that exosite contacts can be important for enhancing protein binding and interaction specificity. PixelDB provides a resource for computational and structural biologists to study, model, and predict core-motif and exosite-contacting peptide interactions. PixelDB is available to the community without restriction in a convenient flat-file format with accompanying visualization tools.
Project description:We have created an Amino Acid-Nucleotide Interaction Database (AANT; http://aant.icmb.utexas.edu/) that categorizes all amino acid-nucleotide interactions from experimentally determined protein-nucleic acid structures, and provides users with a graphic interface for visualizing these interactions in aggregate. AANT accomplishes this by extracting individual amino acid-nucleotide interactions from structures in the Protein Data Bank, combining and superimposing these interactions into multiple structure files (e.g. 20 amino acids x 5 nucleotides) and grouping structurally similar interactions into more readily identifiable clusters. Using the Chime web browser plug-in, users can view 3D representations of the superimpositions and clusters. The unique collection and representation of data on amino acid-nucleotide interactions facilitates understanding the specificity of protein-nucleic acid interactions at a more fundamental level, and allows comparison of otherwise extremely disparate sets of structures. Moreover, by modularly representing the fundamental interactions that govern binding specificity it may prove possible to better engineer nucleic acid binding proteins.
Project description:M-ORBIS is a Molecular Cartography approach that performs integrative high-throughput analysis of structural data to localize all types of binding sites and associated partners by homology and to characterize their properties and behaviors in a systemic way. The robustness of our binding site inferences was compared to four curated datasets corresponding to protein heterodimers and homodimers and protein-DNA/RNA assemblies. The Molecular Cartographies of structurally well-detailed proteins show that 44% of their surfaces interact with non-solvent partners. Residue contact frequencies with water suggest that ~86% of their surfaces are transiently solvated, whereas only 15% are specifically solvated. Our analysis also reveals the existence of two major binding site families: specific binding sites, which can only bind one type of molecule (protein, DNA, RNA, etc.), and polyvalent binding sites, which can bind several distinct types of molecule. Specific homodimer binding sites are, for instance, nearly twice as hydrophobic as previously described and more closely resemble the protein core, while polyvalent binding sites able to form homo- and heterodimers more closely resemble the surfaces involved in crystal packing. Similarly, the regions able to bind DNA and to alternatively form homodimers are more hydrophobic and less polar than previously described DNA binding sites.
Project description:Thermodynamic analysis of urea-biopolymer interactions and effects of urea on folding of proteins and alpha-helical peptides shows that urea interacts primarily with polar amide surface. Urea is therefore predicted to be a quantitative probe of coupled folding, remodeling, and other large-scale changes in the amount of water-accessible polar amide surface in protein processes. A parallel analysis indicates that glycine betaine [N,N,N-trimethylglycine (GB)] can be used to detect burial or exposure of anionic (carboxylate, phosphate) biopolymer surface. To test these predictions, we have investigated the effects of these solutes (0-3 m) on the formation of 1:1 complexes between lac repressor (LacI) and its symmetric operator site (SymL) at a constant KCl molality. Urea reduces the binding constant K(TO) [initial slope dlnK(TO)/dm(urea) = -1.7 ± 0.2], and GB increases K(TO) [initial slope dlnK(TO)/dm(GB) = 2.1 ± 0.2]. For both solutes, this derivative decreases with an increase in solute concentration. Analysis of these initial slopes predicts that (1.5 ± 0.3) × 10³ Å² of polar amide surface and (4.5 ± 1.0) × 10² Å² of anionic surface are buried in the association process. Analysis of published structural data, together with modeling of unfolded regions of free LacI as extended chains, indicates that 1.5 × 10³ Å² of polar amide surface and 6.3 × 10² Å² of anionic surface are buried in complexation. Quantitative agreement between structural and thermodynamic results is obtained for amide surface (urea); for anionic surface (GB), the experimental value is approximately 70% of the structural value. For LacI-SymL binding, two-thirds of the structurally predicted change in amide surface (1.0 × 10³ Å²) occurs outside the protein-DNA interface in protein-protein interfaces formed by folding of the hinge helices and interactions of the DNA binding domain (DBD) with the core of the repressor.
Since urea interacts principally with amide surface, it is particularly well-suited to detect and quantify the extent of coupled folding and other large-scale remodeling events in the steps of protein-nucleic acid interactions and other protein associations.
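The two proportions quoted in this abstract can be checked with simple arithmetic. The short sketch below uses only the central values given in the text, ignoring the stated uncertainties; the variable names are chosen here for illustration.

```python
# Buried surface areas quoted in the abstract, in square angstroms.
anionic_thermodynamic = 4.5e2     # from the glycine betaine initial slope
anionic_structural = 6.3e2        # from structural data and modeling
amide_structural_total = 1.5e3    # total predicted amide surface change
amide_outside_interface = 1.0e3   # part outside the protein-DNA interface

# Experimental anionic value as a fraction of the structural value.
anionic_ratio = anionic_thermodynamic / anionic_structural

# Share of the amide surface change outside the protein-DNA interface.
amide_outside_fraction = amide_outside_interface / amide_structural_total
```

The first ratio comes out near 0.71, in line with the stated "approximately 70%", and the second is two-thirds, as stated.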
Project description:Glycosaminoglycans (GAGs) are very complex, natural anionic polysaccharides. They are polymers of repeating disaccharide units of uronic acid and hexosamine residues. Owing to their template-free, spatiotemporally-controlled, and enzyme-mediated biosyntheses, GAGs possess enormous polydispersity, heterogeneity, and structural diversity, which often translate into multiple biological roles. It is well documented that GAGs contribute to physiological and pathological processes by binding to proteins including serine proteases, serpins, chemokines, growth factors, and microbial proteins. Despite advances in the GAG field, the GAG-protein interface remains largely unexploited by drug discovery programs. Thus, Non-Saccharide Glycosaminoglycan Mimetics (NSGMs) have been rationally developed as a novel class of sulfated molecules that modulate the GAG-protein interface to promote various biological outcomes of substantial benefit to human health. In this review, we describe the chemical, biochemical, and pharmacological aspects of recently reported NSGMs and highlight their therapeutic potential as structurally and mechanistically novel anti-coagulants, anti-cancer agents, anti-emphysema agents, and anti-viral agents. We also describe the challenges that complicate their advancement and the ongoing efforts to overcome these challenges, with the aim of advancing the novel platform of NSGMs to clinical use.
Project description:We report herein the synthesis and DNA/RNA binding properties of bPNA+, a new variant of bifacial peptide nucleic acid (bPNA) that binds oligo T/U nucleic acids to form triplex hybrids. By virtue of a new bivalent side chain on bPNA+, similar DNA affinity and hybrid thermostability can be obtained with half the molecular footprint of previously reported bPNA. Lysine derivatives bearing two melamine bases (K2M) can be prepared on multigram scale by double reductive alkylation with melamine acetaldehyde, resulting in a tertiary amine side chain that affords both peptide solubility and selective base-triple formation with 4 T/U bases; the Fmoc-K2M derivative can be used directly in solid phase peptide synthesis, rendering bPNA+ conveniently accessible. A compact bPNA+ binding site of two U6 domains can be genetically encoded to replace existing 6 bp stem elements at virtually any location within an RNA transcript. We thus replaced internal 6 bp RNA stems that supported loop regions with 6 base-triple hybrid stems using fluorophore-labeled bPNA+. As the loop regions engaged in RNA tertiary interactions, the labeled hybrid stems provided a fluorescent readout; bPNA+ enabled this readout without covalent chemical modification or introduction of new structural elements. This strategy was demonstrated to be effective for reporting on widely observed RNA tertiary interactions such as intermolecular RNA-RNA kissing loop dimerization, RNA-protein binding, and intramolecular RNA tetraloop-tetraloop receptor binding, illustrating the potential general utility of this method. The modest 6 bp stem binding footprint of bPNA+ makes the hybrid stem replacement method practical for noncovalent installation of synthetic probes of RNA interactions. We anticipate that bPNA+ structural probes will be useful for the study of tertiary interactions in long noncoding RNAs.
Project description:Invading pathogens elicit potent immune responses in cells through interactions between structurally conserved molecules derived from the pathogens and specialized innate immune receptors such as the Toll-like receptors (TLRs). Nucleic acids are among the principal TLR ligands. Nucleic acid-sensing TLRs recognize an array of nucleic acids, including double-stranded RNA, single-stranded RNA, and DNAs with specific sequence motifs. Although ligand-induced dimerization followed by TLR activation is commonly observed, both the specific recognition mechanisms and the ligand-receptor interactions vary among different TLRs. In this review, we highlight our current understanding of how these receptors recognize their cognate ligands based on the recent advances in structural biology.
Project description:The OB-fold domain is a compact structural motif frequently used for nucleic acid recognition. Structural comparison of all OB-fold/nucleic acid complexes solved to date confirms the low degree of sequence similarity among members of this family while highlighting several structural sequence determinants common to most of these OB-folds. Loops connecting the secondary structural elements in the OB-fold core are extremely variable in length and in functional detail. However, certain features of ligand binding are conserved among OB-fold complexes, including the location of the binding surface, the polarity of the nucleic acid with respect to the OB-fold, and particular nucleic acid-protein interactions commonly used for recognition of single-stranded and unusually structured nucleic acids. Intriguingly, the observation of shared nucleic acid polarity may shed light on the longstanding question concerning OB-fold origins, indicating that it is unlikely that members of this family arose via convergent evolution.