Template-based structure modeling of protein-protein interactions.
ABSTRACT: The structure of protein-protein complexes can be constructed by using the known structure of other protein complexes as a template. The complex structure templates are generally detected either by homology-based sequence alignments or, given the structure of monomer components, by structure-based comparisons. Critical improvements have been made in recent years by utilizing interface recognition and by recombining monomer and complex template libraries. Encouraging progress has also been witnessed in genome-wide applications of template-based modeling, with modeling accuracy comparable to high-throughput experimental data. Nevertheless, bottlenecks exist due to the incompleteness of the protein-protein complex structure library and the lack of methods for distant homologous template identification and full-length complex structure refinement.
Project description:The total number of protein-protein complex structures currently available in the Protein Data Bank (PDB) is six times smaller than the total number of tertiary structures in the PDB, which limits the power of homology-based approaches to complex structure modeling. We present a threading-recombination approach, COTH, to boost the protein complex structure library by combining tertiary structure templates with complex alignments. The query sequences are first aligned to complex templates using a modified dynamic programming algorithm, guided by ab initio binding-site predictions. The monomer alignments are then shifted to the multimeric template framework by structural alignments. COTH was tested on 500 nonhomologous dimeric proteins and successfully detected correct templates for 50% of the cases after homologous templates were excluded, significantly outperforming conventional homology modeling algorithms. It also shows a higher accuracy in interface modeling than rigid-body docking of unbound structures with ZDOCK, although with lower coverage. These data demonstrate new avenues to model complex structures from nonhomologous templates.
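The binding-site-guided alignment step can be illustrated with a minimal sketch. The code below is a plain Needleman-Wunsch global alignment in which pairing a predicted interface residue of the query with an interface residue of the template earns an extra bonus. The function name, the `site_bonus` parameter, and all scoring values are illustrative assumptions; this is not COTH's actual scoring function.

```python
def align_with_site_bonus(query, template, query_sites, template_sites,
                          match=2, mismatch=-1, gap=-2, site_bonus=1):
    """Return the global alignment score (Needleman-Wunsch) with an interface bonus.

    query_sites and template_sites are sets of 0-based positions that are
    predicted (query) or observed (template) to lie in the binding interface.
    All scoring values are hypothetical defaults chosen for illustration.
    """
    n, m = len(query), len(template)
    # dp[i][j] holds the best score for aligning query[:i] with template[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if query[i - 1] == template[j - 1] else mismatch
            # Reward pairing an interface residue with an interface residue.
            if (i - 1) in query_sites and (j - 1) in template_sites:
                s += site_bonus
            dp[i][j] = max(dp[i - 1][j - 1] + s,   # align the two residues
                           dp[i - 1][j] + gap,     # gap in the template
                           dp[i][j - 1] + gap)     # gap in the query
    return dp[n][m]
```

With this kind of scoring, two templates that align equally well by sequence are separated by how well their interfaces coincide with the predicted binding site of the query.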
Project description:Protein-RNA complexes formed by specific recognition between RNA and RNA-binding proteins play an important role in biological processes. More than a thousand such proteins have been curated in human, and many novel RNA-binding proteins remain to be discovered. Due to limitations of experimental approaches, computational techniques are needed for characterization of protein-RNA interactions. Although much progress has been made, adequate methodologies that reliably provide atomic-resolution structural details are still lacking. Although protein-RNA free docking approaches have proved useful, template-based approaches generally provide higher-quality predictions. Templates are key to building a high-quality model. Sequence/structure relationships were studied based on a representative set of binary protein-RNA complexes from the PDB. Several approaches were tested for pairwise target/template alignment. The analysis revealed a transition point between random and correct binding modes. The results showed that structural alignment is better than sequence alignment at identifying good templates suitable for generating protein-RNA complexes close to the native structure. It also outperforms free docking, successfully predicting complexes where free docking fails, including cases of significant conformational change upon binding. A template-based protein-RNA interaction modeling protocol, PRIME, was developed and benchmarked on a representative set of complexes.
Project description:MOTIVATION: Template-based and template-free methods have both been widely used in predicting the structures of protein-protein complexes. Template-based modeling is effective when a reliable template is available, while template-free methods are required for predicting binding modes or interfaces that have not been previously observed. Our goal is to combine the two methods to improve computational protein-protein complex structure prediction. RESULTS: Here, we present a method to identify and combine high-confidence predictions of a template-based method (SPRING) with a template-free method (ZDOCK). Cross-validated using the protein-protein docking benchmark version 5.0, our method (ZING) achieved a success rate of 68.2%, outperforming SPRING and ZDOCK, with success rates of 52.1% and 35.9% respectively, when the top 10 predictions were considered per test case. In conclusion, a statistics-based method that evaluates and integrates predictions from template-based and template-free methods is more successful than either method independently. AVAILABILITY AND IMPLEMENTATION: ZING is available for download as a GitHub repository (https://github.com/weng-lab/ZING.git). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Project description:Origins of life hypotheses often invoke a transitional phase of nonenzymatic template-directed RNA replication prior to the emergence of ribozyme-catalyzed copying of genetic information. Here, using NMR and ITC, we interrogate the binding affinity of guanosine 5'-monophosphate (GMP) for primer-template complexes when either another GMP, or a helper oligonucleotide, can bind downstream. Binding of GMP to a primer-template complex cannot be significantly enhanced by the possibility of downstream monomer binding, because the affinity of the downstream monomer is weaker than that of the first monomer. Strikingly, GMP binding affinity can be enhanced by ca. 2 orders of magnitude when a helper oligonucleotide is stably bound downstream of the monomer binding site. We compare these thermodynamic parameters to those previously reported for T7 RNA polymerase-mediated replication to help address questions of binding affinity in related nonenzymatic processes.
Project description:Small-angle X-ray scattering (SAXS) is able to extract low-resolution protein shape information without requiring a specific crystal formation. However, it has found little use in atomic-level protein structure determination due to the uncertainty of residue-level structural assignment. We developed a new algorithm, SAXSTER, to couple the raw SAXS data with protein-fold-recognition algorithms and thus improve template-based protein-structure predictions. We designed nine different matching scoring functions of template and experimental SAXS profiles. The logarithm of the integrated correlation score showed the best template recognition ability and had the highest correlation with the true template modeling (TM)-score of the target structures. We tested the method in large-scale protein-fold-recognition experiments and achieved significant improvements in prioritizing the best template structures. When SAXSTER was applied to the proteins of asymmetric SAXS profile distributions, the average TM-score of the top-ranking templates increased by 18% after homologous templates were excluded, which corresponds to a p-value < 10^-9 in Student's t-test. These data demonstrate a promising use of SAXS data to facilitate computational protein structure modeling, which is expected to work most efficiently for proteins of irregular global shape and/or multiple-domain protein complexes.
Project description:The key step of template-based protein-protein structure prediction is the recognition of complexes from experimental structure libraries that have a similar quaternary fold. Maintaining two separate structure libraries, one for monomers and one for dimers, is however laborious, and inappropriate library construction can degrade template recognition coverage. We propose a novel strategy, SPRING, to identify complexes by mapping monomeric threading alignments to protein-protein interactions based on the original oligomer entries in the PDB, which does not rely on library construction and increases the efficiency and quality of complex template recognition. SPRING was tested on 1838 nonhomologous protein complexes and recognized correct quaternary template structures with a TM-score >0.5 in 1115 cases after excluding homologous proteins. The average TM-score of the first model is 60% and 17% higher than that by HHsearch and COTH, respectively, while the number of targets with an interface RMSD <2.5 Å by SPRING is 134% and 167% higher than for these competing methods, respectively. SPRING was also compared with ZDOCK on 77 docking benchmark proteins. Although the relative performance of SPRING and ZDOCK depends on the level of homology filters, a combination of the two methods can result in a significantly higher model quality than ZDOCK at all homology thresholds. These data demonstrate a new efficient approach to quaternary structure recognition that is ready to use for genome-scale modeling of protein-protein interactions due to its high speed and accuracy.
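The TM-score >0.5 cutoff used above refers to the standard TM-score, which sums a distance-dependent term over aligned residue pairs and normalizes by the target length. The sketch below evaluates that formula for a list of residue-pair distances measured after a fixed superposition. It does not search for the optimal superposition as the real TM-score program does, the function names are hypothetical, and the 0.5 Å floor on d0 for short chains is an assumption of this sketch.

```python
def d0_for_length(target_length):
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    if target_length <= 15:
        return 0.5  # assumption: floor value for very short chains
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return max(d0, 0.5)


def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: distance in angstroms between each aligned residue pair.
    target_length: number of residues in the target, used for normalization,
    so unaligned target residues lower the score.
    """
    d0 = d0_for_length(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length
```

A pair at distance 0 contributes 1 and a pair at distance d0 contributes 0.5, so the score lies between 0 and 1 and depends only weakly on protein size.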
Project description:Structural characterization of protein-protein interactions is essential for our ability to study life processes at the molecular level. Computational modeling of protein complexes (protein docking) is important as the source of their structure and as a way to understand the principles of protein interaction. Rapidly evolving comparative docking approaches utilize target/template similarity metrics, which are often based on the protein structure. Although structural similarity generally yields good performance, other characteristics of the interacting proteins (e.g., function, biological process, and localization) may improve the prediction quality, especially in the case of weak target/template structural similarity. For the ranking of a pool of models for each target, we tested scoring functions that quantify similarity of Gene Ontology (GO) terms assigned to target and template proteins in three ontology domains: biological process, molecular function, and cellular component (GO-score). The scoring functions were tested in docking of bound, unbound, and modeled proteins. The results indicate that the combined structural and GO-term functions improve the scoring, especially in the twilight zone of structural similarity, typical for protein models of limited accuracy.
Project description:Multiple protein templates are commonly used in manual protein structure prediction. However, few automated algorithms for selecting and combining multiple templates are available. Here we develop an effective multi-template combination algorithm for protein comparative modeling. The algorithm selects templates according to the similarity significance of the alignments between template and target proteins. It combines the whole template-target alignments whose similarity significance score is close to that of the top template-target alignment within a threshold, whereas it only takes alignment fragments from a less similar template-target alignment that align with a sizable uncovered region of the target. We compare the algorithm with the traditional method of using a single top template on the 45 comparative modeling targets (i.e., easy template-based modeling targets) used in the seventh edition of Critical Assessment of Techniques for Protein Structure Prediction (CASP7). The multi-template combination algorithm improves the GDT-TS scores of predicted models by 6.8% on average. The statistical analysis shows that the improvement is significant (p-value < 10^-4). Compared with the ideal approach that always uses the best template, the multi-template approach yields only slightly better performance. During the CASP7 experiment, the preliminary implementation of the multi-template combination algorithm (FOLDpro) was ranked second among 67 servers in the category of high-accuracy structure prediction in terms of GDT-TS measure. We have developed a novel multi-template algorithm to improve protein comparative modeling.
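The selection rule described above can be sketched as follows. The code keeps whole alignments whose score is within a window of the top score, then takes fragments from weaker alignments only where they cover a sufficiently long, still-uncovered stretch of the target. The tuple layout, the function name, and the `score_window` and `min_fragment` values are illustrative assumptions, not the parameters of the published algorithm.

```python
def select_templates(alignments, target_length, score_window=5.0, min_fragment=10):
    """Pick whole alignments near the top score plus gap-filling fragments.

    alignments: list of (template_id, score, start, end), where a higher
    score means a more significant alignment and [start, end) is the
    covered range in target coordinates (0 <= start < end <= target_length).
    Returns (whole_template_ids, fragments), fragments being
    (template_id, start, end) tuples.
    """
    ranked = sorted(alignments, key=lambda a: a[1], reverse=True)
    if not ranked:
        return [], []
    top_score = ranked[0][1]
    covered = [False] * target_length
    whole, fragments = [], []

    # Pass 1: keep every alignment whose score is close to the top one.
    for tid, score, start, end in ranked:
        if top_score - score <= score_window:
            whole.append(tid)
            for pos in range(start, end):
                covered[pos] = True

    # Pass 2: from weaker alignments, keep only long uncovered stretches.
    for tid, score, start, end in ranked:
        if top_score - score <= score_window:
            continue
        run_start = None
        for pos in range(start, end + 1):
            free = pos < end and not covered[pos]
            if free and run_start is None:
                run_start = pos
            elif not free and run_start is not None:
                if pos - run_start >= min_fragment:
                    fragments.append((tid, run_start, pos))
                    for p in range(run_start, pos):
                        covered[p] = True
                run_start = None
    return whole, fragments
```

For example, with a strong template covering residues 0 to 54 and a weaker one covering 40 to 89, only the 55 to 89 stretch of the weaker alignment would be used.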
Project description:Many high-resolution crystal structures have contributed to our understanding of the reaction pathway for catalysis by DNA and RNA polymerases, but the structural basis of nonenzymatic template-directed RNA replication has not been studied in comparable detail. Here we present crystallographic studies of the binding of ribonucleotide monomers to RNA primer-template complexes, with the goal of improving our understanding of the mechanism of nonenzymatic RNA copying, and of catalysis by polymerases. To explore how activated ribonucleotides recognize and bind to RNA templates, we synthesized an unreactive phosphonate-linked pyrazole analogue of guanosine 5'-phosphoro-2-methylimidazolide (2-MeImpG), a highly activated nucleotide that has been used extensively to study nonenzymatic primer extension. We cocrystallized this analogue with structurally rigidified RNA primer-template complexes carrying single or multiple monomer binding sites, and obtained high-resolution X-ray structures of these complexes. In addition to Watson-Crick base pairing, we repeatedly observed noncanonical guanine:cytidine base pairs in our crystal structures. In most structures, the phosphate and leaving group moieties of the monomers were highly disordered, while in others the distance from O3' of the primer to the phosphorus of the incoming monomer was too great to allow for reaction. We suggest that these effects significantly influence the rate and fidelity of nonenzymatic RNA replication, and that even primitive ribozyme polymerases could enhance RNA replication by enforcing Watson-Crick base pairing between monomers and primer-template complexes, and by bringing the reactive functional groups into closer proximity.
Project description:We developed and tested RAPTOR++ in CASP8 for protein structure prediction. RAPTOR++ contains four modules: threading, model quality assessment, multiple protein alignment, and template-free modeling. RAPTOR++ first threads a target protein to all the templates using three methods and then predicts the quality of the 3D model implied by each alignment using a model quality assessment method. Based upon the predicted quality, RAPTOR++ employs different strategies as follows. If multiple alignments have good quality, RAPTOR++ builds a multiple protein alignment between the target and top templates and then generates a 3D model using MODELLER. If all the alignments have very low quality, RAPTOR++ uses template-free modeling. Otherwise, RAPTOR++ submits a threading-generated 3D model with the best quality. RAPTOR++ was not ready for the first third of the targets and was under development during the whole CASP8 season. The template-based and template-free modeling modules in RAPTOR++ are not closely integrated. We are using our template-free modeling technique to refine template-based models.
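The three-way strategy choice described above amounts to a small decision rule over the predicted model qualities. The sketch below shows one way to express it. The 0-to-1 quality scale, the `good` and `poor` cutoffs, and the function name are hypothetical; the abstract does not state the thresholds RAPTOR++ uses.

```python
def choose_strategy(predicted_qualities, good=0.6, poor=0.3):
    """Choose a modeling strategy from per-alignment predicted qualities.

    predicted_qualities: one predicted quality value per threading
    alignment, on a hypothetical 0-1 scale (higher is better).
    """
    n_good = sum(q >= good for q in predicted_qualities)
    if n_good >= 2:
        # Several good alignments: combine them in a multiple alignment
        # and build the model from multiple templates.
        return "multi-template"
    if all(q < poor for q in predicted_qualities):
        # No usable alignment: fall back to template-free modeling.
        return "template-free"
    # Otherwise submit the single best threading-derived model.
    return "best-single-threading"
```

Keeping the rule separate from the modeling modules makes the cutoffs easy to tune against benchmark targets.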