Dynameomics: a consensus view of the protein unfolding/folding transition state ensemble across a diverse set of protein folds.
ABSTRACT: The Dynameomics project aims to simulate a representative sample of all globular protein metafolds under both native and unfolding conditions. We have identified protein unfolding transition state (TS) ensembles from multiple molecular dynamics simulations of high-temperature unfolding in 183 structurally distinct proteins. These data can be used to study individual proteins and individual protein metafolds and to mine for TS structural features common across all proteins. Separating the TS structures into four different fold classes (all proteins, all-alpha, all-beta, and mixed alpha/beta and alpha+beta) resulted in no significant difference in the overall protein properties. The residues with the most contacts in the native state lost the most contacts in the TS ensemble. On average, residues beginning in an alpha-helix maintained more structure in the TS ensemble than did residues starting in beta-strands or any other conformation. The metafolds studied here represent 67% of all known protein structures, and this is, to our knowledge, the largest, most comprehensive study of the protein folding/unfolding TS ensemble to date. One might have expected broad distributions in the average global properties of the TS relative to the native state, indicating variability in the amount of structure present in the TS. Instead, the average global properties converged with low standard deviations across metafolds, suggesting that there are general rules governing the structure and properties of the TS.
Project description:A previously introduced kinetic-rate constant (k/k(0)) method, where k and k(0) are the folding (unfolding) rate constants in the mutant and the wild-type forms, respectively, of a protein, has been applied to obtain qualitative information about structure in the transition state ensemble (TSE) of bovine pancreatic ribonuclease A (RNase A), which contains four native disulfide bonds. The method compares the folding (unfolding) kinetics of RNase A with and without a covalent crosslink, and tests whether the crosslinked residues are associated in the folding (unfolding) transition state (TS) of the noncrosslinked version. To confirm that the fifth disulfide bond has not introduced a significant structural perturbation, we solved the crystal structure of the V43C-R85C mutant to 1.6 Å resolution. Our findings suggest that residues Val43 and Arg85 are not associated, and that residues Ala4 and Val118 may form nonnative contacts, in the folding (unfolding) TSE of RNase A.
Project description:Characterization of the folding transition-state ensemble and the denatured-state ensemble is an important step toward a full elucidation of protein folding mechanisms. We report herein an investigation of the free-energy landscape of FSD-1 protein by a total of four sets of folding and unfolding molecular dynamics simulations with explicit solvent. The transition-state ensemble was initially identified from unfolding simulations at 500 K and was verified by simulations at 300 K starting from the ensemble structures. The denatured-state ensemble and the early-stage folding were studied by a combination of unfolding simulations at 500 K and folding simulations at 300 K starting from the extended conformation. A common feature of the transition-state ensemble was the substantial formation of the native secondary structures, including both the alpha-helix and beta-sheet, with partial exposure of the hydrophobic core in the solvent. Both the native and non-native secondary structures were observed in the denatured-state ensemble and early-stage folding, consistent with the smooth experimental melting curve. Interestingly, the contact orders of the transition-state ensemble structures were similar to that of the native structure and were notably lower than those of the compact structures found in early-stage folding, implying that chain and topological entropy might play significant roles in protein folding. Implications for FSD-1 folding mechanisms and the rate-limiting step are discussed. Analyses further revealed interesting non-native interactions in the denatured-state ensemble and early-stage folding and the possibility that destabilization of these interactions could help to enhance the stability and folding rate of the protein.
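The contact-order comparison in the abstract above can be made concrete with a small sketch. The function below computes a relative contact order, i.e. the average sequence separation of contacting residue pairs divided by the chain length. Using one coordinate per residue, a 6 Å cutoff, and the function name `relative_contact_order` are illustrative assumptions, not the procedure used in the FSD-1 study.

```python
import math


def relative_contact_order(coords, cutoff=6.0, min_separation=1):
    """Relative contact order from one (x, y, z) point per residue.

    Assumptions (not from the source): one point per residue and a
    distance cutoff in the same units as the coordinates.
    """
    n = len(coords)
    total_separation = 0
    n_contacts = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i  # sequence separation of the pair
                n_contacts += 1
    if n_contacts == 0:
        return 0.0
    # Average separation normalized by chain length.
    return total_separation / (n * n_contacts)


# Toy coordinates only: a compact square versus an extended chain.
compact = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
extended = [(3.8 * i, 0.0, 0.0) for i in range(4)]
rco_compact = relative_contact_order(compact)
rco_extended = relative_contact_order(extended)
```

In this toy case the compact arrangement has more long-range contacts and therefore a higher contact order than the extended chain, which contains only nearest-neighbor contacts.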
Project description:Structural insights into the equilibrium folding mechanism of the alpha subunit of tryptophan synthase (alpha TS) from Escherichia coli, a (beta alpha)(8) TIM barrel protein, were obtained with a pair of complementary nuclear magnetic resonance (NMR) spectroscopic techniques. The secondary structures of rare high-energy partially folded states were probed by native-state hydrogen-exchange NMR analysis of main-chain amide hydrogens. 2D heteronuclear single quantum coherence NMR analysis of several (15)N-labeled nonpolar amino acids was used to probe the side chains involved in stabilizing a highly denatured intermediate that is devoid of secondary structure. The dynamic broadening of a subset of isoleucine and leucine side chains and the absence of protection against exchange showed that the highest energy folded state on the free-energy landscape is stabilized by a hydrophobic cluster lacking stable secondary structure. The core of this cluster, centered near the N-terminus of alpha TS, serves as a nucleus for the stabilization of what appears to be nonnative secondary structure in a marginally stable intermediate. The progressive decrease in protection against exchange from this nucleus toward both termini and from the N-termini to the C-termini of several beta-strands is best described by an ensemble of weakly coupled conformers. Comparison with previous data strongly suggests that this ensemble corresponds to a marginally stable off-pathway intermediate that arises in the first few milliseconds of folding and persists under equilibrium conditions. A second, more stable intermediate, which has an intact beta-barrel and a frayed alpha-helical shell, coexists with this marginally stable species. The conversion of the more stable intermediate to the native state of alpha TS entails the formation of a stable helical shell and completes the acquisition of the tertiary structure.
Project description:In Phi-value analysis, the effects of mutations on the folding kinetics are compared with the corresponding effects on thermodynamic stability to investigate the structure of the protein-folding transition state (TS). Here, molecular dynamics (MD) simulations (totaling 0.65 ms) have been performed for a large set of single-point mutants of a 20-residue three-stranded antiparallel beta-sheet peptide. Between 57 and 120 folding events were sampled at near equilibrium for each mutant, allowing for accurate estimates of folding/unfolding rates and stability changes. The Phi values calculated from folding and unfolding rates extracted from the MD trajectories are reliable if the stability loss upon mutation is larger than approximately 0.6 kcal/mol, which is observed for 8 of the 32 single-point mutants. The same heterogeneity of the TS of the wild type was found in the mutated peptides, showing two possible pathways for folding. Single-point mutations can induce significant TS shifts not always detected by Phi-value analysis. Specific nonnative interactions at the TS were observed in most of the peptides studied here. The interpretation of Phi values based on the ratio of atomic contacts at the TS over the native state, which has been used in the past in MD and Monte Carlo simulations, is in agreement with the TS structures of wild-type peptide. However, Phi values tend to overestimate the nativeness of the TS ensemble, when interpreted neglecting the nonnative interactions.
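As a rough illustration of the Phi-value calculation described above, the sketch below derives a folding Phi value from wild-type and mutant folding/unfolding rates under a two-state assumption. The function name, the 300 K default temperature, and the example rates are hypothetical; only the approximate 0.6 kcal/mol reliability threshold comes from the abstract.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def phi_from_rates(kf_wt, kf_mut, ku_wt, ku_mut,
                   temperature=300.0, min_ddg=0.6):
    """Folding Phi value from rates, assuming two-state kinetics.

    Returns None when the stability loss is below min_ddg (kcal/mol),
    following the reliability criterion quoted in the abstract.
    The temperature default is an assumption, not a value from the source.
    """
    rt = R_KCAL * temperature
    # Change in the folding barrier (positive if the mutant folds slower).
    ddg_fold_barrier = rt * math.log(kf_wt / kf_mut)
    # Change in the unfolding barrier (negative if the mutant unfolds faster).
    ddg_unfold_barrier = rt * math.log(ku_wt / ku_mut)
    # Stability loss on mutation for a two-state folder.
    ddg_eq = ddg_fold_barrier - ddg_unfold_barrier
    if ddg_eq < min_ddg:
        return None
    return ddg_fold_barrier / ddg_eq


# Hypothetical rates: folding halves and unfolding doubles on mutation.
phi_example = phi_from_rates(kf_wt=100.0, kf_mut=50.0, ku_wt=1.0, ku_mut=2.0)
# Hypothetical rates with too small a stability change to be reliable.
phi_unreliable = phi_from_rates(kf_wt=100.0, kf_mut=90.0, ku_wt=1.0, ku_mut=1.1)
```

Returning `None` rather than a number for small stability changes keeps unreliable ratios (small numerator over small denominator) out of any downstream averaging.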
Project description:To search for submolecular foldon units, the spontaneous reversible unfolding and refolding of staphylococcal nuclease under native conditions was studied by a kinetic native-state hydrogen exchange (HX) method. As for other proteins, it appears that staphylococcal nuclease is designed as an assembly of well-integrated foldon units that may define steps in its folding pathway and may regulate some other functional properties. The HX results identify 34 amide hydrogens that exchange with solvent hydrogens under native conditions by way of large transient unfolding reactions. The HX data for each hydrogen measure the equilibrium stability (ΔG_HX) and the kinetic unfolding and refolding rates (k_op and k_cl) of the unfolding reaction that exposes it to exchange. These parameters separate the 34 identified residues into three distinct HX groupings. Two correspond to clearly defined structural units in the native protein, termed the blue and red foldons. The remaining HX grouping contains residues, not well separated by their HX parameters alone, that represent two other distinct structural units in the native protein, termed the green and yellow foldons. Among these four sets, a last unfolding foldon (blue) unfolds with a rate constant of 6 × 10^-6 s^-1 and free energy equal to the protein's global stability (10.0 kcal/mol). It represents part of the beta-barrel, including mutually H-bonding residues in the beta4 and beta5 strands, a part of the beta3 strand that H-bonds to beta5, and residues at the N-terminus of the alpha2 helix that is capped by beta5. A second foldon (green), which unfolds and refolds more rapidly and at slightly lower free energy, includes residues that define the rest of the native alpha2 helix and its C-terminal cap. A third foldon (yellow) defines the mutually H-bonded beta1-beta2-beta3 meander, completing the native beta-barrel, plus an adjacent part of the alpha1 helix. A final foldon (red) includes residues on remaining segments that are distant in sequence but nearly adjacent in the native protein. Although the structure of the partially unfolded forms closely mimics the native organization, four residues indicate the presence of some nonnative misfolding interactions. Because the unfolding parameters of many other residues are not determined, it seems likely that the concerted foldon units are more extensive than is shown by the 34 residues actually observed.
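The link between the HX stability and the opening/closing rates mentioned in the abstract above can be sketched with the standard two-state relation, free energy = -RT ln(k_op / k_cl). The snippet below is a minimal illustration under that assumption; the function names and the 298.15 K default temperature are hypothetical and are not taken from the study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def hx_free_energy(k_op, k_cl, temperature=298.15):
    """Opening free energy (kcal/mol) from opening and closing rates."""
    return -R_KCAL * temperature * math.log(k_op / k_cl)


def closing_rate(k_op, dg_hx, temperature=298.15):
    """Closing rate implied by an opening rate and an opening free energy."""
    k_eq = math.exp(-dg_hx / (R_KCAL * temperature))  # K_op = k_op / k_cl
    return k_op / k_eq


# Values quoted for the blue foldon; the temperature is an assumed default.
k_cl_blue = closing_rate(k_op=6e-6, dg_hx=10.0)
dg_roundtrip = hx_free_energy(6e-6, k_cl_blue)
```

Because the temperature is assumed, the implied closing rate should be read as an order-of-magnitude illustration only.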
Project description:The Dynameomics project contains native state and unfolding simulations of 807 protein domains, where each domain is representative of a different metafold; these metafolds encompass ~97% of protein fold space. There is a long-standing question in structural biology as to whether proteins in the same fold family share the same folding/unfolding characteristics. Using molecular dynamics simulations from the Dynameomics project, we conducted a detailed study of protein unfolding/folding pathways for 5 protein domains from the immunoglobulin (Ig)-like β-sandwich metafold (the highest ranked metafold in our database). The domains have sequence similarities ranging from 4 to 15% and are all from different SCOP superfamilies, yet they share the same overall Ig-like topology. Despite having very different amino acid sequences, the dominant unfolding pathway is very similar for the 5 proteins, and the secondary structures that are peripheral to the aligned, shared core domain add variability to the unfolding pathway. Aligned residues in the core domain display consensus structure in the transition state primarily through conservation of hydrophobic positions. Commonalities in the obligate folding nucleus indicate that insights into the major events in the folding/unfolding of other domains from this metafold may be obtainable from unfolding simulations of a few representative proteins.
Project description:The goal of the Dynameomics project is to perform, store, and analyze molecular dynamics simulations of representative proteins, of all known globular folds, in their native state and along their unfolding pathways. To analyze unfolding simulations, the location of the protein along the unfolding reaction coordinate (RXN) must be determined. Properties such as the fraction of native contacts and radius of gyration are often used; however, there is an issue regarding degeneracy with these properties, as native and nonnative species can overlap. Here, we used 15 physical properties of the protein to construct a multidimensional-embedded, one-dimensional RXN coordinate that faithfully captures the complex nature of unfolding. The unfolding RXN coordinates for 188 proteins (1534 simulations and 22.9 μs in explicit water) were calculated. Native, transition, intermediate, and denatured states were readily identified with the use of this RXN coordinate. A global native ensemble based on the native-state properties of the 188 proteins was created. This ensemble was shown to be effective for calculating RXN coordinates for folds outside the initial 188 targets. These RXN coordinates enable high-throughput assignment of conformational states, which represents an important step in comparing protein properties across fold space as well as characterizing the unfolding of individual proteins.
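The two commonly used single properties named above, radius of gyration and fraction of native contacts, can be sketched as follows. This is not the 15-property reaction coordinate from the abstract; it only illustrates the simpler descriptors. Equal masses, one point per residue, the 8 Å cutoff, and all function names are illustrative assumptions.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration (equal masses assumed)."""
    n = len(coords)
    center = tuple(sum(p[k] for p in coords) / n for k in range(3))
    return math.sqrt(sum(math.dist(p, center) ** 2 for p in coords) / n)


def contact_pairs(coords, cutoff=8.0, min_separation=3):
    """Index pairs (i, j) closer than cutoff and at least min_separation apart."""
    n = len(coords)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
        if math.dist(coords[i], coords[j]) <= cutoff
    }


def fraction_native_contacts(native, frame, cutoff=8.0, min_separation=3):
    """Fraction of the native-state contacts still present in a frame."""
    native_pairs = contact_pairs(native, cutoff, min_separation)
    if not native_pairs:
        return 0.0
    kept = sum(
        1 for i, j in native_pairs if math.dist(frame[i], frame[j]) <= cutoff
    )
    return kept / len(native_pairs)


# Toy coordinates only: a compact "native" set and a fully extended frame.
native = [(0, 0, 0), (4, 0, 0), (4, 4, 0), (0, 4, 0), (0, 0, 4)]
unfolded = [(10.0 * i, 0.0, 0.0) for i in range(5)]
q_native = fraction_native_contacts(native, native)
q_unfolded = fraction_native_contacts(native, unfolded)
rg_pair = radius_of_gyration([(0, 0, 0), (2, 0, 0)])
```

Two quite different conformations can share similar values of either descriptor, which is the degeneracy problem that motivates combining many properties into one coordinate.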
Project description:The discovery of new protein folds is a relatively rare occurrence even as the rate of protein structure determination increases. This rarity reinforces the concept of folds as reusable units of structure and function shared by diverse proteins. If the folding mechanism of proteins is largely determined by their topology, then the folding pathways of members of existing folds could encompass the full set used by globular protein domains. We have used recent versions of three common protein domain dictionaries (SCOP, CATH and Dali) to generate a consensus domain dictionary (CDD). Surprisingly, 40% of the metafolds in the CDD are not composed of autonomous structural domains, i.e. they are not plausible independent folding units. This finding has serious ramifications for bioinformatics studies mining these domain dictionaries for globular protein properties. However, our main purpose in deriving this updated CDD was to choose targets for MD simulation as part of our Dynameomics effort, which aims to simulate the native and unfolding pathways of representatives of all globular protein consensus folds (metafolds). Consequently, we also compiled a list of representative protein targets of each metafold in the CDD. This domain dictionary is available at www.dynameomics.org.
Project description:Protein folding mechanisms are probed experimentally using single-point mutant perturbations. The relative effects on the folding (phi-values) and unfolding (1 - phi) rates are used to infer the detailed structure of the transition-state ensemble (TSE). Here we analyze kinetic data on >800 mutations carried out for 24 proteins with simple kinetic behavior. We find two surprising results: (i) all mutant effects are described by the equation: ΔΔG‡(unfold) = 0.76 ΔΔG(eq) ± 1.8 kJ/mol. Therefore all data are consistent with a single phi-value (0.24) with accuracy comparable to experimental precision, suggesting that the structural information in conventional phi-values is low. (ii) phi-values change with stability, increasing in mean value and spread from native to unfolding conditions, and thus cannot be interpreted without proper normalization. We eliminate stability effects by calculating the phi-values at the mutant denaturation midpoints; i.e., conditions of zero stability (phi(0)). We then show that the intrinsic variability is phi(0) = 0.36 ± 0.11, being somewhat larger for beta-sheet-rich proteins than for alpha-helical proteins. Importantly, we discover that phi(0)-values are proportional to how many of the residues surrounding the mutated site are local in sequence. High phi(0)-values correspond to protein surface sites, which have few nonlocal neighbors, whereas core residues with many tertiary interactions produce the lowest phi(0)-values. These results suggest a general mechanism in which the TSE at zero stability is a broad conformational ensemble stabilized by local interactions and without specific tertiary interactions, reconciling phi-values with many other empirical observations.
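The step from the fitted slope of 0.76 to a single phi-value of 0.24 follows from the unfolding effect being (1 - phi) times the equilibrium effect. A minimal sketch of that reasoning is given below: it fits a line through the origin to unfolding-barrier changes against stability changes and reports one minus the slope. The through-origin least-squares fit, the function names, and the data points are assumptions for illustration; the data are synthetic and not taken from the study.

```python
def slope_through_origin(x, y):
    """Least-squares slope of y = m * x (no intercept)."""
    sxx = sum(a * a for a in x)
    if sxx == 0:
        raise ValueError("x values must not all be zero")
    return sum(a * b for a, b in zip(x, y)) / sxx


def single_phi_from_unfolding_slope(ddg_eq, ddg_unfold_barrier):
    """Phi implied by the unfolding slope: phi = 1 - slope."""
    return 1.0 - slope_through_origin(ddg_eq, ddg_unfold_barrier)


# Synthetic, noise-free data (kJ/mol) constructed with a slope of 0.76.
ddg_eq = [2.0, 5.0, 8.0, 12.0]
ddg_unfold = [0.76 * v for v in ddg_eq]
phi_single = single_phi_from_unfolding_slope(ddg_eq, ddg_unfold)
```

With real data the scatter around the fitted line (quoted as ± 1.8 kJ/mol in the abstract) determines how much per-site structural information remains beyond the single average phi.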
Project description:One of the outstanding questions in protein folding concerns the degree of heterogeneity in the folding transition state ensemble: does a protein fold via a large multitude of diverse "pathways," or are the elements of native structure assembled in a well defined order? Herein, we build on previous point mutagenesis studies of the src SH3 by directly investigating the association of structural elements and the loss of backbone conformational entropy during folding. Double-mutant analysis of polar residues in the distal beta-hairpin and the diverging turn indicates that the hydrogen bond network between these elements is largely formed in the folding transition state. A 10-glycine insertion in the n-src loop (which connects the distal hairpin and the diverging turn) and a disulfide crosslink at the base of the distal beta-hairpin exclusively affect the folding rate, showing that these structural elements are nearly as ordered in the folding transition state as in the native state. In contrast, crosslinking the base of the RT loop or the N and C termini dramatically slows down the unfolding rate, suggesting that dissociation of the termini and opening of the RT loop precede the rate-limiting step in unfolding. Taken together, these results suggest that essentially all conformations in the folding transition state ensemble have the central three-stranded beta-sheet formed, indicating that, for the src homology 3 domain, there is a discrete order to structure assembly during folding.