Evidence, from simulations, of a single state with residual native structure at the thermal denaturation midpoint of a small globular protein.
ABSTRACT: The folding of the B-domain of staphylococcal protein A has been studied by coarse-grained canonical and multiplexed replica-exchange molecular dynamics simulations with the UNRES force field in a broad range of temperatures (270 K ≤ T ≤ 350 K). In canonical simulations, the folding was found to occur either directly to the native state or through kinetic traps, mainly the topological mirror image of the native three-helix bundle. The latter folding scenario was observed more frequently at low temperatures. With increasing temperature, the frequency of the transitions between the folded and misfolded/unfolded states increased and the folded state became more diffuse, with conformations exhibiting increased root-mean-square deviations from the experimental structure (from about 4 Å at T = 300 K to 8.7 Å at T = 325 K). An analysis of the equilibrium conformational ensemble determined from multiplexed replica exchange simulations at the folding-transition temperature (T_f = 325 K) showed that the conformational ensemble at this temperature is a collection of conformations with residual secondary structures, which possess native or near-native clusters of nonpolar residues in place, and not a 50:50 mixture of fully folded and fully unfolded conformations. These findings contradict the quasi-chemical picture of two- or multistate protein folding, which assumes an equilibrium between the folded, unfolded, and intermediate states, with equilibrium shifting with temperature but with the native conformations remaining essentially unchanged. Our results also suggest that long-range hydrophobic contacts are the essential factor in keeping the structure of a protein thermally stable.
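The replica-exchange simulations mentioned in this abstract rely on periodic attempts to swap configurations between neighboring temperatures. The following is a minimal sketch of the standard Metropolis acceptance rule for such a swap, not the UNRES implementation. The function name, the energy units (kcal/mol), and the example energies are assumptions made for illustration; the multiplexed variant additionally runs several replicas per temperature, which is not shown here.

```python
import math

# Boltzmann constant in kcal/(mol K); the unit choice is an assumption.
K_B = 0.0019872041


def swap_probability(energy_i, energy_j, temp_i, temp_j):
    """Metropolis probability of exchanging two replicas.

    Replica i has potential energy energy_i at temperature temp_i,
    replica j has energy_j at temp_j. The standard criterion is
    min(1, exp[(beta_i - beta_j) * (E_i - E_j)]).
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # Only exponentiate negative arguments to avoid overflow.
    return 1.0 if delta >= 0.0 else math.exp(delta)


# Hypothetical energies: the colder replica already has the lower energy,
# so the swap is accepted only part of the time.
p_unfavorable = swap_probability(-100.0, -90.0, 300.0, 310.0)
# Reversed energies: the swap is always accepted.
p_favorable = swap_probability(-90.0, -100.0, 300.0, 310.0)
```

Closely spaced temperatures keep the acceptance probability high, which is one reason a ladder of many replicas is typically used to cover a range such as 270 K to 350 K.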
Project description:We study the unbiased folding/unfolding thermodynamics of the Trp-cage miniprotein using detailed molecular dynamics simulations of an all-atom model of the protein in explicit solvent using the Amber ff99SB force field. Replica-exchange molecular dynamics simulations are used to sample the protein ensembles over a broad range of temperatures covering the folded and unfolded states at two densities. The obtained ensembles are shown to reach equilibrium on the 1 μs/replica timescale. The total simulation time used in the calculations exceeds 100 μs. Ensemble averages of the fraction folded, pressure, and energy differences between the folded and unfolded states as a function of temperature are used to model the free energy of the folding transition, ΔG(P, T), over the whole region of temperatures and pressures sampled in the simulations. The ΔG(P, T) diagram describes an ellipse over the range of temperatures and pressures sampled, predicting that the system can undergo pressure-induced unfolding and cold denaturation at low temperatures and high pressures, and unfolding at low pressures and high temperatures. The calculated free energy function exhibits remarkably good agreement with the experimental folding transition temperature (T_f = 321 K), free energy, and specific heat changes. However, changes in enthalpy and entropy are significantly different from the experimental values. We speculate that these differences may be due to the simplicity of the semiempirical force field used in the simulations and that more elaborate force fields may be required to describe the thermodynamics of proteins appropriately.
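The abstract does not state the functional form used for ΔG(P, T). One commonly used choice that can produce an elliptical stability diagram is a second-order expansion around a reference point (P0, T0), often called a Hawley-type model. The sketch below assumes that form and the convention ΔG = G(unfolded) − G(folded). Apart from T0, which is set to the transition temperature quoted in the abstract, every parameter value is a hypothetical placeholder and not a result from the study.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)

# Hypothetical placeholder parameters (NOT taken from the paper).
PARAMS = {
    "P0": 0.1,        # reference pressure, MPa
    "T0": 321.0,      # reference temperature, K (T_f quoted in the abstract)
    "dG0": 0.0,       # unfolding free energy at (P0, T0), kJ/mol
    "dV0": -0.01,     # volume change, kJ/(mol MPa)
    "dS0": 0.15,      # entropy change, kJ/(mol K)
    "dCp": 0.3,       # heat capacity change, kJ/(mol K)
    "dalpha": 1e-4,   # expansivity change, kJ/(mol MPa K)
    "dbeta": -1e-4,   # compressibility change, kJ/(mol MPa^2)
}


def delta_g(pressure, temperature, p=PARAMS):
    """Assumed second-order (Hawley-type) model of the unfolding free energy."""
    d_p = pressure - p["P0"]
    d_t = temperature - p["T0"]
    heat_capacity_term = temperature * (math.log(temperature / p["T0"]) - 1.0) + p["T0"]
    return (
        p["dG0"]
        + p["dV0"] * d_p
        - p["dS0"] * d_t
        + 0.5 * p["dbeta"] * d_p ** 2
        + p["dalpha"] * d_p * d_t
        - p["dCp"] * heat_capacity_term
    )


def fraction_folded(dg_unfold, temperature):
    """Two-state folded population for a given unfolding free energy."""
    return 1.0 / (1.0 + math.exp(-dg_unfold / (R * temperature)))


dg_ref = delta_g(PARAMS["P0"], PARAMS["T0"])
f_ref = fraction_folded(dg_ref, PARAMS["T0"])
```

In a fit of this kind, the parameters would be adjusted until the model reproduces the simulated fraction folded at each sampled temperature and pressure; the contour ΔG(P, T) = 0 then marks the boundary of the folded region.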
Project description:We present a new approach to study a multitude of folding pathways and different folding mechanisms for the 20-residue mini-protein Trp-Cage using the combined power of replica exchange molecular dynamics (REMD) simulations for conformational sampling, transition path theory (TPT) for constructing folding pathways, and stochastic simulations for sampling the pathways in a high dimensional structure space. REMD simulations of Trp-Cage with 16 replicas at temperatures between 270 and 566 K are carried out with an all-atom force field (OPLSAA) and an implicit solvent model (AGBNP). The conformations sampled from all temperatures are collected. They form a discretized state space that can be used to model the folding process. The equilibrium population for each state at a target temperature can be calculated using the weighted-histogram-analysis method (WHAM). By connecting states with similar structures and creating edges satisfying detailed balance conditions, we construct a kinetic network that preserves the equilibrium population distribution of the state space. After defining the folded and unfolded macrostates, committor probabilities (P(fold)) are calculated by solving a set of linear equations for each node in the network and pathways are extracted together with their fluxes using the TPT algorithm. By clustering the pathways into folding "tubes", a more physically meaningful picture of the diversity of folding routes emerges. Stochastic simulations are carried out on the network, and a procedure is developed to project sampled trajectories onto the folding tubes. The fluxes through the folding tubes calculated from the stochastic trajectories are in good agreement with the corresponding values obtained from the TPT analysis. The temperature dependence of the ensemble of Trp-Cage folding pathways is investigated. Above the folding temperature, a large number of diverse folding pathways with comparable fluxes flood the energy landscape. 
At low temperature, however, the folding transition is dominated by only a few localized pathways.
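The committor step described in this abstract can be illustrated on a toy network. The sketch below assumes a row-stochastic transition matrix and solves the committor equations q_i = Σ_j P_ij q_j (with q = 0 on the unfolded states and q = 1 on the folded states) by simple fixed-point iteration rather than a direct linear solve. The five-state chain, the function name, and the tolerance are hypothetical and are not taken from the study.

```python
def committor(transition_matrix, unfolded, folded, tol=1e-12, max_iter=100_000):
    """Forward committor: probability of reaching `folded` before `unfolded`.

    transition_matrix[i][j] is the probability of jumping from state i to j.
    Boundary states are held fixed; interior states are updated in place
    (Gauss-Seidel sweeps) until the largest change falls below `tol`.
    """
    n = len(transition_matrix)
    q = [1.0 if i in folded else 0.0 for i in range(n)]
    for _ in range(max_iter):
        largest_change = 0.0
        for i in range(n):
            if i in unfolded or i in folded:
                continue
            updated = sum(transition_matrix[i][j] * q[j] for j in range(n))
            largest_change = max(largest_change, abs(updated - q[i]))
            q[i] = updated
        if largest_change < tol:
            break
    return q


# Toy linear chain: state 0 is "unfolded", state 4 is "folded",
# and each interior state steps left or right with equal probability.
P = [
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.5, 0.5],
]
q = committor(P, unfolded={0}, folded={4})
```

For this symmetric chain the committor rises linearly from the unfolded to the folded end. On a large kinetic network a sparse direct or iterative linear solver would normally replace the plain loop.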
Project description:Simulations of first-passage folding of the antiparallel β-sheet miniprotein beta3s, which has been intensively studied under equilibrium conditions by A. Caflisch and co-workers, show that the kinetics and dynamics are significantly different from those for equilibrium folding. Because the folding of a protein in a living system generally corresponds to the former (i.e., the folded protein is stable and unfolding is a rare event), the difference is of interest. In contrast to equilibrium folding, the Ch-curl conformations become very rare because they contain unfavorable parallel β-strand arrangements, which are difficult to form dynamically due to the distant N- and C-terminal strands. At the same time, the formation of helical conformations becomes much easier (particularly in the early stage of folding) due to short-range contacts. The hydrodynamic descriptions of the folding reaction have also revealed that while the equilibrium flow field presented a collection of local vortices with closed "streamlines", the first-passage folding is characterized by a pronounced overall flow from the unfolded states to the native state. The flows through the locally stable structures Cs-or and Ns-or, which are conformationally close to the native state, are negligible due to detailed balance established between these structures and the native state. Although there are significant differences in the general picture of the folding process from the equilibrium and first-passage folding simulations, some aspects of the two are in agreement. The rate of transitions between the clusters of characteristic protein conformations in both cases decreases approximately exponentially with the distance between the clusters in the hydrogen bond distance space of collective variables, and the folding time distribution in the first-passage segments of the equilibrium trajectory is in good agreement with that for the first-passage folding simulations.
Project description:The LYS24/29NLE double mutant of villin headpiece subdomain (HP35) is the fastest folding protein known so far, with a folding time constant of 0.6 μs. In this work, the folding mechanism of the mutant has been investigated by both conventional and replica exchange molecular dynamics (CMD and REMD) simulations with the AMBER FF03 force field and a generalized-Born solvation model. Direct comparison to the ab initio folding of the wild type HP35 enabled a close examination of the mutational effect on the folding process. The mutant folded to the native state, as demonstrated by the 0.50 Å Cα root mean square deviation (RMSD) sampled in both CMD and REMD simulations and the high population of the folded conformation compared with the denatured conformations. Consistent with experiments, the significantly reduced primary folding free energy barrier makes the mutant closer to a downhill folder than the wild type HP35, which directly leads to the faster transition and higher melting temperature. However, unlike the proposed downhill folding, which envisages a smooth shift between unfolded and folded states without a transition barrier, we observed a well-defined folding transition that was consistent with experiments. Further examination of the secondary structures revealed that the two mutated residues have a higher intrinsic helical preference that facilitated the formation of both helix III and the intermediate state, which contains the folded segment helix II/III. Other factors contributing to the faster folding include the more favorable electrostatic interactions in the transition state with the removal of the charged NH3+ groups from LYS. In addition, both the transition state ensemble and the denatured state ensemble are shifted in the mutant.
Project description:Hairpin loop structures are common motifs in folded nucleic acids. The 5'-GCGCAGC sequence in DNA forms a characteristic and stable trinucleotide hairpin loop flanked by a two basepair stem helix. To better understand the structure formation of this hairpin loop motif in atomic detail, we employed replica-exchange molecular dynamics (RexMD) simulations starting from a single-stranded DNA conformation. In two independent 36 ns RexMD simulations, conformations in very close agreement with the experimental hairpin structure were sampled as dominant conformations (lowest free energy state) during the final phase of the RexMDs (approximately 35% at the lowest temperature replica). Simultaneous compaction and accumulation of folded structures were observed. Comparison of the GCA trinucleotides from early stages of the simulations with the folded topology indicated a variety of central loop conformations, but arrangements close to experiment that are sampled before the fully folded structure also appeared. Most of these intermediates included a stacking of the C2 and G3 bases, which was further stabilized by hydrogen bonding to the A5 base and a strongly bound water molecule bridging the C2 and A5 in the DNA minor groove. The simulations suggest a folding mechanism where these intermediates can rapidly proceed toward the fully folded hairpin and emphasize the importance of loop and stem nucleotide interactions for hairpin folding. In one simulation, a loop motif with G3 in syn conformation (dihedral flip at N-glycosidic bond) accumulated, resulting in a misfolded hairpin. Such conformations may correspond to long-lived trapped states that have been postulated to account for the folding kinetics of nucleic acid hairpins that are slower than expected for a semiflexible polymer of the same size.
Project description:We report the modification and parametrization of the united-residue (UNRES) force field for energy-based protein structure prediction and protein folding simulations. We tested the approach on three training proteins separately: 1E0L (beta), 1GAB (alpha), and 1E0G (alpha + beta). Heretofore, the UNRES force field had been designed and parametrized to locate native-like structures of proteins as global minima of their effective potential energy surfaces, which largely neglected the conformational entropy because decoys composed of only lowest-energy conformations were used to optimize the force field. Recently, we developed a mesoscopic dynamics procedure for UNRES and applied it with success to simulate protein folding pathways. However, the force field turned out to be largely biased toward α-helical structures in canonical simulations because the conformational entropy had been neglected in the parametrization. We applied the hierarchical optimization method, developed in our earlier work, to optimize the force field; in this method, the conformational space of a training protein is divided into levels, each corresponding to a certain degree of native-likeness. The levels are ordered according to increasing native-likeness; level 0 corresponds to structures with no native-like elements, and the highest level corresponds to the fully native-like structures. The aim of optimization is to achieve the order of the free energies of levels, decreasing as their native-likeness increases. The procedure is iterative, and decoys of the training protein(s) generated with the energy function parameters of the preceding iteration are used to optimize the force field in a current iteration. We applied the multiplexing replica-exchange molecular dynamics (MREMD) method, recently implemented in UNRES, to generate decoys; with this modification, conformational entropy is taken into account.
Moreover, we optimized the free-energy gaps between levels at temperatures corresponding to a predominance of folded or unfolded structures, as well as to structures at the putative folding-transition temperature, changing the sign of the gaps at the transition temperature. This enabled us to obtain force fields characterized by a single peak in the heat capacity at the transition temperature. Furthermore, we introduced temperature dependence to the UNRES force field; this is consistent with the fact that it is a free-energy and not a potential energy function.
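A heat-capacity curve of the kind referred to here is often estimated from energy fluctuations in a canonical ensemble. The sketch below uses the textbook relation C_v = (⟨E²⟩ − ⟨E⟩²) / (R T²) for molar energies. It deliberately ignores the temperature dependence of the force field itself, which would add further terms for a temperature-dependent energy function such as the one described, and the energy samples are hypothetical.

```python
R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)


def heat_capacity(energies, temperature):
    """Fluctuation estimate of C_v in kcal/(mol K).

    `energies` are samples (kcal/mol) drawn at a single temperature.
    Assumes a temperature-independent energy function.
    """
    n = len(energies)
    mean = sum(energies) / n
    variance = sum((e - mean) ** 2 for e in energies) / n
    return variance / (R_KCAL * temperature ** 2)


# Hypothetical samples: a narrow distribution away from the transition
# and a broad, two-population distribution near it.
cv_narrow = heat_capacity([-120.0, -121.0, -119.0, -120.0], 280.0)
cv_broad = heat_capacity([-120.0, -60.0, -118.0, -62.0], 325.0)
```

Evaluating this estimate at each simulated temperature and plotting the result against temperature gives the curve whose peak marks the folding transition.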
Project description:Reaching the native states of small proteins, a necessary step towards a comprehensive understanding of the folding mechanisms, has remained a tremendous challenge to ab initio protein folding simulations despite extensive effort. In this work, the folding process of the B domain of protein A (BdpA) has been simulated by both conventional and replica exchange molecular dynamics using the AMBER FF03 all-atom force field. Starting from an extended chain, a total of 40 conventional (each to 1.0 μs) and two sets of replica exchange (each to 200.0 ns per replica) molecular dynamics simulations were performed with different generalized-Born solvation models and temperature control schemes. The improvements in both the force field and solvent model allowed successful simulations of the folding process to the native state, as demonstrated by the 0.80 Å Cα root mean square deviation (RMSD) of the best folded structure. The most populated conformation was the native folded structure with a high population. This was a significant improvement over the 2.8 Å Cα RMSD of the best nativelike structures from previous ab initio folding studies on BdpA. To the best of our knowledge, our results demonstrate, for the first time, that ab initio simulations can reach the native state of BdpA. Consistent with experimental observations, including Phi-value analyses, formation of the helix II/III hairpin was a crucial step that provides a template upon which helix I could form and the folding process could complete. Early formation of helix III was observed, which is consistent with the experimental results of higher residual helical content of isolated helix III among the three helices. The calculated temperature-dependent profile and the melting temperature were in close agreement with the experimental results.
The simulations further revealed that phenylalanine 31 may play a critical role in achieving the correct packing of the three helices, which is consistent with the experimental observation. In addition to the mechanistic studies, an ab initio structure prediction was also conducted based on both the physical energy and a statistical potential. Based on the lowest physical energy, the predicted structure was 2.0 Å Cα RMSD away from the experimentally determined structure.
Project description:As they are not subjected to natural selection, de novo designed proteins usually fold in a manner different from natural proteins. Recently, a de novo designed mini-protein, DS119, with a βαβ motif and 36 amino acids, was found to fold unusually slowly in experiments, and transient dimers were detected in the folding process. Here, by means of all-atom replica exchange molecular dynamics (REMD) simulations, several comparably stable intermediate states were observed on the folding free-energy landscape of DS119. Conventional molecular dynamics (CMD) simulations showed that when two unfolded DS119 proteins bound together, most binding sites of dimeric aggregates were located at the N-terminal segment, especially residues 5-10, which were supposed to form a β-sheet with its own C-terminal segment. Furthermore, a large percentage of individual proteins in the dimeric aggregates adopted conformations similar to those in the intermediate states observed in REMD simulations. These results indicate that, during the folding process, DS119 can easily become trapped in intermediate states. Then, with diffusion, a transient dimer would be formed and stabilized, with the binding interface located at the N-termini. This means that it could not quickly fold to the native structure. The complicated folding behavior of DS119 implies an important influence of natural selection on protein-folding kinetics, and further improvement is needed in rational protein design.
Project description:The conformational space of a 20-residue three-stranded antiparallel beta-sheet peptide (double hairpin) was sampled by equilibrium folding/unfolding molecular dynamics simulations for a total of 20 μs. The resulting one-dimensional free-energy profiles (FEPs) provide a detailed description of the free-energy basins and barriers for the folding reaction. The similarity of the FEPs obtained using the probability of folding before unfolding (pfold) or the mean first passage time supports the robustness of the procedure. The folded state and the most populated free-energy basins in the denatured state are described by the one-dimensional FEPs, which avoid the overlap of states present in the usual one- or two-dimensional projections. Within the denatured state, a basin with fluctuating helical conformations and a heterogeneous entropic state are populated near the melting temperature at about 11% and 33%, respectively. Folding pathways from the helical basin or enthalpic traps (with only one of the two hairpins formed) reach the native state through the entropic state, which is on-pathway and is separated by a low barrier from the folded state. A simplified equilibrium kinetic network based on the FEPs shows the complexity of the folding reaction and indicates, as augmented by additional analyses, that the basins in the denatured state are connected primarily by the native state. The overall folding kinetics shows single-exponential behavior because barriers between the non-native basins and the folded state have similar heights.
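Basin populations such as the 11% and 33% quoted in this abstract translate directly into relative free energies through ΔG = −RT ln(p_a / p_b). The short sketch below applies that relation to the two quoted populations. The temperature of 330 K is an assumption chosen only for illustration, since the abstract does not give the melting temperature, and the function name is hypothetical.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)


def relative_free_energy(population_a, population_b, temperature):
    """Free energy of basin a relative to basin b, in kcal/mol."""
    return -R_KCAL * temperature * math.log(population_a / population_b)


ASSUMED_TEMPERATURE = 330.0  # K; placeholder, not stated in the abstract

# Helical basin (about 11%) relative to the entropic state (about 33%).
dg_helical_vs_entropic = relative_free_energy(0.11, 0.33, ASSUMED_TEMPERATURE)
```

A positive value means the helical basin is less populated, and therefore higher in free energy, than the entropic state at the chosen temperature.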
Project description:Many proteins with complex topologies require molecular chaperones to achieve their unique three-dimensional folded structure. The E. coli chaperone GroEL binds a large number of unfolded and partially folded proteins to facilitate proper folding and prevent misfolding and aggregation. Although the major structural components of GroEL are well defined, scaffolds of the non-native substrates that determine chaperone-mediated folding have been difficult to recognize. Here we performed all-atom and replica-exchange molecular dynamics simulations to dissect the non-native ensemble of an obligate GroEL folder, DapA. Thermodynamic analyses of unfolding simulations revealed populated intermediates with distinct structural characteristics. We found that surface-exposed hydrophobic patches are significantly increased, with contributions primarily from native and non-native β-sheet elements. We validate the structural properties of these conformers using experimental data, including circular dichroism (CD), 1-anilinonaphthalene-8-sulfonic acid (ANS) binding measurements, and previously reported hydrogen-deuterium exchange coupled to mass spectrometry (HDX-MS). Further, we constructed network graphs to elucidate long-range intra-protein connectivity of native and intermediate topologies, demonstrating regions that serve as central "hubs". Overall, our results imply that genomic variations (or mutations) in the distinct regions of protein structures might disrupt these topological signatures, disabling chaperone-mediated folding and leading to the formation of aggregates.