Iterative Molecular Dynamics-Rosetta Protein Structure Refinement Protocol to Improve Model Quality.
ABSTRACT: Rosetta is one of the prime tools for high resolution protein structure refinement. While its scoring function can distinguish native-like from non-native-like conformations in many cases, the method is limited by conformational sampling for larger proteins, that is, leaving a local energy minimum in which the search algorithm may get stuck. Here, we test the hypothesis that iteration of Rosetta with an orthogonal sampling and scoring strategy might facilitate exploration of conformational space. Specifically, we run short molecular dynamics (MD) simulations on models created by de novo folding of large proteins into cryoEM density maps to enable sampling of conformational space not directly accessible to Rosetta and thus provide an escape route from the conformational traps. We present a combined MD-Rosetta protein structure refinement protocol that can overcome some of these sampling limitations. Two of four benchmark proteins showed incremental improvement through all three rounds of the iterative refinement protocol. Molecular dynamics is most efficient in applying subtle but important rearrangements within secondary structure elements and is thus highly complementary to the Rosetta refinement, which focuses on side chains and loop regions.
Project description:Many excellent methods exist that incorporate cryo-electron microscopy (cryoEM) data to constrain computational protein structure prediction and refinement. Previously, it was shown that iteration of two such orthogonal sampling and scoring methods – Rosetta and molecular dynamics (MD) simulations – facilitated exploration of conformational space in principle. Here, we go beyond a proof-of-concept study and address significant remaining limitations of the iterative MD–Rosetta protein structure refinement protocol. Specifically, all parts of the iterative refinement protocol are now guided by medium-resolution cryoEM density maps, and previous knowledge about the native structure of the protein is no longer necessary. Models are identified solely based on score or simulation time. All four benchmark proteins showed substantial improvement through three rounds of the iterative refinement protocol. The best-scoring final models of two proteins had sub-Ångstrom RMSD to the native structure over residues in secondary structure elements. Molecular dynamics was most efficient in refining secondary structure elements and was thus highly complementary to the Rosetta refinement which is most powerful in refining side chains and loop regions.
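The alternation described above can be sketched as a simple loop. This is a minimal illustration only, not the authors' implementation: `md_step`, `rosetta_step`, and `score` are hypothetical callables standing in for a short density-guided MD run, a Rosetta refinement that returns several candidate models, and a scoring function (lower is better). The MD-then-Rosetta order within a round and the choice of the single best-scoring candidate are assumptions.

```python
def iterative_refinement(model, md_step, rosetta_step, score, rounds=3):
    """Alternate two sampling strategies and keep the best-scoring model.

    md_step, rosetta_step and score are hypothetical placeholders:
      md_step(model)      -> one model after a short MD simulation
      rosetta_step(model) -> list of candidate models from Rosetta refinement
      score(model)        -> float, lower is better
    Returns the selected model after each round, starting with the input.
    """
    history = [model]
    for _ in range(rounds):
        model = md_step(model)              # escape the local minimum
        candidates = rosetta_step(model)    # refine and generate candidates
        model = min(candidates, key=score)  # select by score alone
        history.append(model)
    return history


# Toy stand-ins: a "model" is a single number and the score is its distance from 0.
history = iterative_refinement(
    4.0,
    md_step=lambda x: x - 1.0,
    rosetta_step=lambda x: [x + 0.5, x - 0.5, x],
    score=abs,
    rounds=3,
)
```

No knowledge of the native structure enters the loop; selection uses only the score, which mirrors the abstract's statement that models are identified by score or simulation time.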
Project description:Knowing atomistic details of proteins is essential not only for the understanding of protein function but also for the development of drugs. Experimental methods such as X-ray crystallography, NMR, and cryo-electron microscopy (cryo-EM) are the preferred forms of protein structure determination and have achieved great success over recent decades. Computational methods may be an alternative when experimental techniques fail. However, computational methods are severely limited when it comes to predicting larger macromolecule structures with little sequence similarity to known structures. The incorporation of experimental restraints in computational methods is becoming increasingly important to more reliably predict protein structure. One such experimental input used in structure prediction and refinement is the cryo-EM density map. Recent advances in cryo-EM have arguably revolutionized the field of structural biology. Our previously developed cryo-EM-guided Rosetta-MD protocol has shown great promise in the refinement of soluble protein structures. In this study, we extended cryo-EM density-guided iterative Rosetta-MD to membrane proteins. We also improved the methodology in general by picking models based on a combination of their score and fit-to-density during the Rosetta model selection. By doing so, we have been able to pick models superior to those chosen by the previous selection based on Rosetta score only, and we have been able to further improve our previously refined models of soluble proteins. The method was tested with five membrane-spanning protein structures. By applying density-guided Rosetta-MD iteratively, we were able to refine the predicted structures of these membrane proteins to atomic resolution. We also showed that the resolution of the density maps determines the improvement and quality of the refined models. By incorporating high-resolution density maps (∼4 Å), we improved the quality of the models more than when medium-resolution maps (6.9 Å) were used. Beginning from an average starting structure root mean square deviation (RMSD) to native of 4.66 Å, our protocol was able to refine the structures to bring the average refined structure RMSD to 1.66 Å when 4 Å density maps were used. The protocol also successfully refined the HIV-1 CTD guided by an experimental 5 Å density map.
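Selecting models on a combination of score and fit-to-density, as described above, can be illustrated with a short sketch. The abstract does not state how the two terms are combined, so the z-score standardization, the equal default weighting, the function names, and the example values below are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to zero mean and unit variance (all 0.0 if constant)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rank_models(models, weight_density=0.5):
    """Return model names ordered from best to worst.

    models: list of (name, energy, density_fit) tuples, where a lower energy
    and a higher density fit (e.g. a correlation coefficient) are better.
    weight_density: assumed relative weight of the density term, in [0, 1].
    """
    energy_z = zscores([m[1] for m in models])
    density_z = zscores([m[2] for m in models])
    combined = [
        (1.0 - weight_density) * ez - weight_density * dz
        for ez, dz in zip(energy_z, density_z)
    ]
    order = sorted(range(len(models)), key=lambda i: combined[i])
    return [models[i][0] for i in order]


# Made-up example values, not data from the study.
models = [
    ("model_a", -250.0, 0.70),
    ("model_b", -240.0, 0.85),
    ("model_c", -200.0, 0.60),
]
ranking = rank_models(models)                           # score and density
by_score_only = rank_models(models, weight_density=0.0)  # score alone
```

In this toy case the model with the lowest energy (`model_a`) is overtaken by `model_b` once the density fit is taken into account, which is the kind of reordering a combined criterion is meant to produce.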
Project description:A method for the local refinement of protein structures that targets improvements in local stereochemistry while preserving the overall fold is presented. The method uses force field-based minimization and sampling via molecular dynamics simulations with a modified force field to bring bonds, angles, and torsion angles into an acceptable range for high-resolution protein structures. The method is implemented in the locPREFMD web server and was tested on computational models submitted to CASP11. Using MolProbity scores as the main assessment criterion, the locPREFMD method significantly improves the stereochemical quality of given input models close to the quality expected for experimental structures while maintaining the Cα coordinates of the initial model.
Project description:Refinement is the last step in protein structure prediction pipelines to convert approximate homology models to experimental accuracy. Protocols based on molecular dynamics (MD) simulations have shown promise, but current methods are limited to moderate levels of consistent refinement. To explore the energy landscape between homology models and native structures and analyze the challenges of MD-based refinement, eight test cases were studied via extensive simulations followed by Markov state modeling. In all cases, native states were found very close to the experimental structures and at the lowest free energies, but refinement was hindered by a rough energy landscape. Transitions from the homology model to the native states require the crossing of significant kinetic barriers on at least microsecond time scales. A significant energetic driving force toward the native state was lacking until its immediate vicinity, and there was significant sampling of off-pathway states competing for productive refinement. The role of recent force field improvements is discussed and transition paths are analyzed in detail to inform which key transitions have to be overcome to achieve successful refinement.
Project description:A molecular dynamics (MD) simulation based protocol for structure refinement of template-based model predictions is described. The protocol involves the application of restraints, ensemble averaging of selected subsets, interpolation between initial and refined structures, and assessment of refinement success. It is found that sub-microsecond MD-based sampling when combined with ensemble averaging can produce moderate but consistent refinement for most systems in the CASP targets considered here.
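Two of the steps named above, ensemble averaging and interpolation between the initial and refined structures, reduce to simple coordinate arithmetic. The sketch below assumes that all frames have already been superimposed onto a common reference and share the same atom order; the function names and the interpolation fraction are illustrative choices, not values taken from the protocol.

```python
def average_structure(frames):
    """Average coordinates over an ensemble of pre-superimposed frames.

    frames: list of structures, each a list of (x, y, z) tuples in the same
    atom order.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    if any(len(frame) != n_atoms for frame in frames):
        raise ValueError("all frames must contain the same number of atoms")
    return [
        tuple(sum(frame[i][k] for frame in frames) / n_frames for k in range(3))
        for i in range(n_atoms)
    ]


def interpolate(initial, refined, fraction):
    """Move each atom a given fraction of the way from initial to refined."""
    return [
        tuple(a + fraction * (b - a) for a, b in zip(p, q))
        for p, q in zip(initial, refined)
    ]


# Two-atom toy system with two selected MD frames.
initial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
selected_frames = [
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
    [(0.0, 0.0, 4.0), (1.0, 0.0, 4.0)],
]
averaged = average_structure(selected_frames)
halfway = interpolate(initial, averaged, 0.5)
```

Averaged coordinates can have distorted local geometry, so in practice an averaged or interpolated structure would normally be followed by energy minimization, which is not shown here.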
Project description:NMR structure calculation using NOE-derived distance restraints requires a considerable number of assignments of both backbone and side-chain resonances, which are often difficult or impossible to obtain for large or complex proteins. Pseudocontact shifts (PCSs) also play a well-established role in NMR protein structure calculation, usually to augment existing structural, mostly NOE-derived, information. Existing refinement protocols using PCSs usually either require a sizeable number of side-chain assignments or are complemented by other experimental restraints. Here, we present an automated iterative procedure to perform backbone protein structure refinements requiring only a limited number of backbone amide PCSs. Already known structural features from a starting homology model, in this case modules of repeat proteins, are framed into a scaffold that is subsequently refined by experimental PCSs. The method produces reliable indicators that can be monitored to judge its performance. We applied it to a system in which side-chain assignments are hardly possible, designed Armadillo repeat proteins (dArmRPs), and we calculated the solution NMR structure of YM4A, a dArmRP containing four sequence-identical internal modules, obtaining high convergence to a single structure. We suggest that this approach is particularly useful when approximate folds are known from other techniques, such as X-ray crystallography, while avoiding inherent artefacts due to, for instance, crystal packing.
Project description:Summary: Refinement of protein structure models is a long-standing problem in structural bioinformatics. Molecular dynamics-based methods have emerged as an avenue to achieve consistent refinement. The PREFMD web server implements an optimized protocol based on the method successfully tested in CASP11. Validation with recent CASP refinement targets shows consistent and more significant improvement in global structure accuracy over other state-of-the-art servers. Availability and implementation: PREFMD is freely available as a web server at http://feiglab.org/prefmd. Scripts for running PREFMD as a stand-alone package are available at https://github.com/feiglab/prefmd.git. Contact: feig@msu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Project description:One of the critical difficulties of molecular dynamics (MD) simulations in protein structure refinement is that the physics-based energy landscape lacks a middle-range funnel to guide nonnative conformations toward near-native states. We propose to use the target model as a probe to identify fragmental analogs from the PDB. The distance maps are then used to reshape the MD energy funnel. The protocol was tested on 181 benchmarking and 26 CASP targets. It was found that structure models of correct folds with TM-score >0.5 can often be pulled closer to native with higher GDT-HA score, but improvements for the models of incorrect folds (TM-score <0.5) are much less pronounced. These data indicate that template-based fragmental distance maps essentially reshaped the MD energy landscape from golf-course-like to funnel-like ones in the successfully refined targets with a radius of TM-score ∼0.5. These results demonstrate a new avenue to improve high-resolution structures by combining knowledge-based template information with physics-based MD simulations.
Project description:The structure of human protein HSPC034 has been determined by both solution nuclear magnetic resonance (NMR) spectroscopy and X-ray crystallography. Refinement of the NMR structure ensemble, using a Rosetta protocol in the absence of NMR restraints, resulted in significant improvements not only in structure quality, but also in molecular replacement (MR) performance with the raw X-ray diffraction data using MOLREP and Phaser. This method has recently been shown to be generally applicable with improved MR performance demonstrated for eight NMR structures refined using Rosetta (Qian et al., Nature 2007;450:259-264). Additionally, NMR structures of HSPC034 calculated by standard methods that include NMR restraints have improvements in the RMSD to the crystal structure and MR performance in the order DYANA, CYANA, XPLOR-NIH, and CNS with explicit water refinement (CNSw). Further Rosetta refinement of the CNSw structures, perhaps due to more thorough conformational sampling and/or a superior force field, was capable of finding alternative low energy protein conformations that were equally consistent with the NMR data according to the Recall, Precision, and F-measure (RPF) scores. On further examination, the additional MR-performance shortfall for NMR refined structures as compared with the X-ray structure was attributed, in part, to crystal-packing effects, real structural differences, and inferior hydrogen bonding in the NMR structures. A good correlation between a decrease in the number of buried unsatisfied hydrogen-bond donors and improved MR performance demonstrates the importance of hydrogen-bond terms in the force field for improving NMR structures. The superior hydrogen-bond network in Rosetta-refined structures demonstrates that correct identification of hydrogen bonds should be a critical goal of NMR structure refinement. Inclusion of nonbivalent hydrogen bonds identified from Rosetta structures as additional restraints in the structure calculation results in NMR structures with improved MR performance.
Project description:Protein structures provide valuable information for understanding biological processes. Protein structures can be determined by experimental methods such as X-ray crystallography, nuclear magnetic resonance spectroscopy, or cryogenic electron microscopy. As an alternative, in silico methods can be used to predict protein structures. These methods utilize protein structure databases for structure prediction via template-based modeling or for training machine-learning models to generate predictions. Structure prediction for proteins distant from proteins with known structures often results in lower accuracy with respect to the true physiological structures. Physics-based protein model refinement methods can be applied to improve model accuracy in the predicted models. Refinement methods rely on conformational sampling around the predicted structures, and if structures closer to the native states are sampled, improvements in the model quality become possible. Molecular dynamics simulations have been especially successful for improving model qualities but although consistent refinement can be achieved, the improvements in model qualities are still moderate. To extend the refinement performance of a simulation-based protocol, we explored new schemes that focus on optimized use of biasing functions and the application of increased simulation temperatures. In addition, we tested the use of alternative initial models so that the simulations can explore the conformational space more broadly. Based on the insights of this analysis, we are proposing a new refinement protocol that significantly outperformed previous state-of-the-art molecular dynamics simulation-based protocols in the benchmark tests described here.