Free energy determinants of tertiary structure and the evaluation of protein models.
ABSTRACT: We develop a protocol for estimating the free energy difference between different conformations of the same polypeptide chain. The conformational free energy evaluation combines the CHARMM force field with a continuum treatment of the solvent. In almost all cases studied, experimentally determined structures are predicted to be more stable than misfolded "decoys." This is due in part to the fact that the Coulomb energy of the native protein is consistently lower than that of the decoys. The solvation free energy generally favors the decoys, although the total electrostatic free energy (sum of Coulomb and solvation terms) favors the native structure. The behavior of the solvation free energy is somewhat counterintuitive and, surprisingly, is not correlated with differences in the burial of polar area between native structures and decoys. Rather. the effect is due to a more favorable charge distribution in the native protein, which, as is discussed, will tend to decrease its interaction with the solvent. Our results thus suggest, in keeping with a number of recent studies, that electrostatic interactions may play an important role in determining the native topology of a folded protein. On this basis, a simplified scoring function is derived that combines a Coulomb term with a hydrophobic contact term. This function performs as well as the more complete free energy evaluation in distinguishing the native structure from misfolded decoys. Its computational efficiency suggests that it can be used in protein structure prediction applications, and that it provides a physically well-defined alternative to statistically derived scoring functions.
Project description:A scoring protocol based on implicit membrane-based scoring functions and a new protocol for optimizing the positioning of proteins inside the membrane was evaluated for its capacity to discriminate native-like states from misfolded decoys. A decoy set previously established by the Baker lab (Proteins: Struct., Funct., Genet. 2006, 62, 1010-1025) was used along with a second set that was generated to cover higher resolution models. The Implicit Membrane Model 1 (IMM1), IMM1 model with CHARMM 36 parameters (IMM1-p36), generalized Born with simple switching (GBSW), and heterogeneous dielectric generalized Born versions 2 (HDGBv2) and 3 (HDGBv3) were tested along with the new HDGB van der Waals (HDGBvdW) model that adds implicit van der Waals contributions to the solvation free energy. For comparison, scores were also calculated with the distance-scaled finite ideal-gas reference (DFIRE) scoring function. Z-scores for native state discrimination, energy vs root-mean-square deviation (RMSD) correlations, and the ability to select the most native-like structures as top-scoring decoys were evaluated to assess the performance of the scoring functions. Ranking of the decoys in the Baker set that were relatively far from the native state was challenging and dominated largely by packing interactions that were captured best by DFIRE with less benefit of the implicit membrane-based models. Accounting for the membrane environment was much more important in the second decoy set where especially the HDGB-based scoring functions performed very well in ranking decoys and providing significant correlations between scores and RMSD, which shows promise for improving membrane protein structure prediction and refinement applications. The new membrane structure scoring protocol was implemented in the MEMScore web server ( http://feiglab.org/memscore ).
Project description:Deciphering the structural determinants of protein-protein interactions (PPIs) is essential to gain a deep understanding of many important biological functions in the living cells. Computational approaches for the structural modeling of PPIs, such as protein-protein docking, are quite needed to complement existing experimental techniques. The reliability of a protein-protein docking method is dependent on the ability of the scoring function to accurately distinguish the near-native binding structures from a huge number of decoys. In this study, we developed HawkRank, a novel scoring function designed for the sampling stage of protein-protein docking by summing the contributions from several energy terms, including van der Waals potentials, electrostatic potentials and desolvation potentials. First, based on the solvation free energies predicted by the Generalized Born model for ~ 800 proteins, a SASA (solvent accessible surface area)-based solvation model was developed, which can give the aqueous solvation free energies for proteins by summing the contributions of 21 atom types. Then, the van der Waals potentials and electrostatic potentials based on the Amber ff14SB force field were computed. Finally, the HawkRank scoring function was derived by determining the most optimal weights for five energy terms based on the training set. Here, MSR (modified success rate), a novel protein-protein scoring quality index, was used to assess the performance of HawkRank and three other popular protein-protein scoring functions, including ZRANK, FireDock and dDFIRE. The results show that HawkRank outperformed the other three scoring functions according to the total number of hits and MSR. HawkRank is available at http://cadd.zju.edu.cn/programs/hawkrank .
Project description:We have calculated the stability of decoy structures of several proteins (from the CASP3 models and the Park and Levitt decoy set) relative to the native structures. The calculations were performed with the force field-consistent ES/IS method, in which an implicit solvent (IS) model is used to calculate the average solvation free energy for snapshots from explicit simulations (ESs). The conformational free energy is obtained by adding the internal energy of the solute from the ESs and an entropic term estimated from the covariance positional fluctuation matrix. The set of atomic Born radii and the cavity-surface free energy coefficient used in the implicit model has been optimized to be consistent with the all-atom force field used in the ESs (cedar/gromos with simple point charge (SPC) water model). The decoys are found to have a consistently higher free energy than that of the native structure; the gap between the native structure and the best decoy varies between 10 and 15 kcal/mole, on the order of the free energy difference that typically separates the native state of a protein from the unfolded state. The correlation between the free energy and the extent to which the decoy structures differ from the native (as root mean square deviation) is very weak; hence, the free energy is not an accurate measure for ranking the structurally most native-like structures from among a set of models. Analysis of the energy components shows that stability is attained as a result of three major driving forces: (1) minimum size of the protein-water surface interface; (2) minimum total electrostatic energy, which includes solvent polarization; and (3) minimum protein packing energy. The detailed fit required to optimize the last term may underlie difficulties encountered in recovering the native fold from an approximate decoy or model structure.
Project description:BACKGROUND:We present a simple method to train a potential function for the protein folding problem which, even though trained using a small number of proteins, is able to place a significantly large number of native conformations near a local minimum. The training relies on generating decoys by energy minimization of the native conformations using the current potential and using a physically meaningful objective function (derivative of energy with respect to torsion angles at the native conformation) during the quadratic programming to place the native conformation near a local minimum. RESULTS:We also compare the performance of three different types of energy functions and find that while the pairwise energy function is trainable, a solvation energy function by itself is untrainable if decoys are generated by minimizing the current potential starting at the native conformation. The best results are obtained when a pairwise interaction energy function is used with solvation energy function. CONCLUSIONS:We are able to train a potential function using six proteins which places a total of 42 native conformations within approximately 4 A rmsd and 71 native conformations within approximately 6 A rmsd of a local minimum out of a total of 91 proteins. Furthermore, the threading test using the same 91 proteins ranks 89 native conformations to be first and the other two as second.
Project description:Central in the variational implicit-solvent model (VISM) [Dzubiella, Swanson, and McCammon Phys. Rev. Lett.2006, 96, 087802 and J. Chem. Phys.2006, 124, 084905] of molecular solvation is a mean-field free-energy functional of all possible solute-solvent interfaces or dielectric boundaries. Such a functional can be minimized numerically by a level-set method to determine stable equilibrium conformations and solvation free energies. Applications to nonpolar systems have shown that the level-set VISM is efficient and leads to qualitatively and often quantitatively correct results. In particular, it is capable of capturing capillary evaporation in hydrophobic confinement and corresponding multiple equilibrium states as found in molecular dynamics (MD) simulations. In this work, we introduce into the VISM the Coulomb-field approximation of the electrostatic free energy. Such an approximation is a volume integral over an arbitrary shaped solvent region, requiring no solutions to any partial differential equations. With this approximation, we obtain the effective boundary force and use it as the "normal velocity" in the level-set relaxation. We test the new approach by calculating solvation free energies and potentials of mean force for small and large molecules, including the two-domain protein BphC. Our results reveal the importance of coupling polar and nonpolar interactions in the underlying molecular systems. In particular, dehydration near the domain interface of BphC subunits is found to be highly sensitive to local electrostatic potentials as seen in previous MD simulations. This is a first step toward capturing the complex protein dehydration process by an implicit-solvent approach.
Project description:Molecular dynamics-based free energy calculations allow the determination of a variety of thermodynamic quantities from computer simulations of small molecules. Thermodynamic integration (TI) calculations can suffer from instabilities during the creation or annihilation of particles. This "singularity" problem can be addressed with "soft-core" potential functions which keep pairwise interaction energies finite for all configurations and provide smooth free energy curves. "One-step" transformations, in which electrostatic and van der Waals forces are simultaneously modified, can be simpler and less expensive than "two-step" transformations in which these properties are changed in separate calculations. Here, we study solvation free energies for molecules of different hydrophobicity using both models. We provide recommended values for the two parameters ?(LJ) and ?(C) controlling the behavior of the soft-core Lennard-Jones and Coulomb potentials and compare one- and two-step transformations with regard to their suitability for numerical integration. For many types of transformations, the one-step procedure offers a convenient and accurate approach to free energy estimates.
Project description:A phase-field variational implicit-solvent approach is developed for the solvation of charged molecules. The starting point of such an approach is the representation of a solute-solvent interface by a phase field that takes one value in the solute region and another in the solvent region, with a smooth transition from one to the other on a small transition layer. The minimization of an effective free-energy functional of all possible phase fields determines the equilibrium conformations and free energies of an underlying molecular system. All the surface energy, the solute-solvent van der Waals interaction, and the electrostatic interaction are coupled together self-consistently through a phase field. The surface energy results from the minimization of a double-well potential and the gradient of a field. The electrostatic interaction is described by the Coulomb-field approximation. Accurate and efficient methods are designed and implemented to numerically relax an underlying charged molecular system. Applications to single ions, a two-plate system, and a two-domain protein reveal that the new theory and methods can capture capillary evaporation in hydrophobic confinement and corresponding multiple equilibrium states as found in molecular dynamics simulations. Comparisons of the phase-field and the original sharp-interface variational approaches are discussed.
Project description:BACKGROUND: The reliable prediction of protein tertiary structure from the amino acid sequence remains challenging even for small proteins. We have developed an all-atom free-energy protein forcefield (PFF01) that we could use to fold several small proteins from completely extended conformations. Because the computational cost of de-novo folding studies rises steeply with system size, this approach is unsuitable for structure prediction purposes. We therefore investigate here a low-cost free-energy relaxation protocol for protein structure prediction that combines heuristic methods for model generation with all-atom free-energy relaxation in PFF01. RESULTS: We use PFF01 to rank and cluster the conformations for 32 proteins generated by ROSETTA. For 22/10 high-quality/low quality decoy sets we select near-native conformations with an average Calpha root mean square deviation of 3.03 A/6.04 A. The protocol incorporates an inherent reliability indicator that succeeds for 78% of the decoy sets. In over 90% of these cases near-native conformations are selected from the decoy set. This success rate is rationalized by the quality of the decoys and the selectivity of the PFF01 forcefield, which ranks near-native conformations an average 3.06 standard deviations below that of the relaxed decoys (Z-score). CONCLUSION: All-atom free-energy relaxation with PFF01 emerges as a powerful low-cost approach toward generic de-novo protein structure prediction. The approach can be applied to large all-atom decoy sets of any origin and requires no preexisting structural information to identify the native conformation. The study provides evidence that a large class of proteins may be foldable by PFF01.
Project description:Crystallographic data about T-Cell Receptor - peptide - major histocompatibility complex class I (TCRpMHC) interaction have revealed extremely diverse TCR binding modes triggering antigen recognition. Understanding the molecular basis that governs TCR orientation over pMHC is still a considerable challenge. We present a simplified rigid approach applied on all non-redundant TCRpMHC crystal structures available. The CHARMM force field in combination with the FACTS implicit solvation model is used to study the role of long-distance interactions between the TCR and pMHC. We demonstrate that the sum of the coulomb interactions and the electrostatic solvation energies is sufficient to identify two orientations corresponding to energetic minima at 0° and 180° from the native orientation. Interestingly, these results are shown to be robust upon small structural variations of the TCR such as changes induced by Molecular Dynamics simulations, suggesting that shape complementarity is not required to obtain a reliable signal. Accurate energy minima are also identified by confronting unbound TCR crystal structures to pMHC. Furthermore, we decompose the electrostatic energy into residue contributions to estimate their role in the overall orientation. Results show that most of the driving force leading to the formation of the complex is defined by CDR1,2/MHC interactions. This long-distance contribution appears to be independent from the binding process itself, since it is reliably identified without considering neither short-range energy terms nor CDR induced fit upon binding. Ultimately, we present an attempt to predict the TCR/pMHC binding mode for a TCR structure obtained by homology modeling. The simplicity of the approach and the absence of any fitted parameters make it also easily applicable to other types of macromolecular protein complexes.
Project description:Protein structure prediction techniques proceed in two steps, namely the generation of many structural models for the protein of interest, followed by an evaluation of all these models to identify those that are native-like. In theory, the second step is easy, as native structures correspond to minima of their free energy surfaces. It is well known however that the situation is more complicated as the current force fields used for molecular simulations fail to recognize native states from misfolded structures. In an attempt to solve this problem, we follow an alternate approach and derive a new potential from geometric knowledge extracted from native and misfolded conformers of protein structures. This new potential, Metric Protein Potential (MPP), has two main features that are key to its success. Firstly, it is composite in that it includes local and nonlocal geometric information on proteins. At the short range level, it captures and quantifies the mapping between the sequences and structures of short (7-mer) fragments of protein backbones through the introduction of a new local energy term. The local energy term is then augmented with a nonlocal residue-based pairwise potential, and a solvent potential. Secondly, it is optimized to yield a maximized correlation between the energy of a structural model and its root mean square (RMS) to the native structure of the corresponding protein. We have shown that MPP yields high correlation values between RMS and energy and that it is able to retrieve the native structure of a protein from a set of high-resolution decoys.