Frustration in the energy landscapes of multidomain protein misfolding.
ABSTRACT: Frustration from strong interdomain interactions can make misfolding a more severe problem in multidomain proteins than in single-domain proteins. On the basis of bioinformatic surveys, it has been suggested that lowering the sequence identity between neighboring domains is one of nature's solutions to the multidomain misfolding problem. We investigate folding of multidomain proteins using the associative-memory, water-mediated, structure and energy model (AWSEM), a predictive coarse-grained protein force field. We find that reducing sequence identity not only decreases the formation of domain-swapped contacts but also decreases the formation of strong self-recognition contacts between β-strands with high hydrophobic content. The ensembles of misfolded structures that result from forming these amyloid-like interactions are energetically disfavored compared with the native state, but entropically favored. Therefore, these ensembles are more stable than the native ensemble under denaturing conditions, such as high temperature. Domain-swapped contacts compete with self-recognition contacts in forming various trapped states, and point mutations can shift the balance between the two types of interaction. We predict that multidomain proteins that lack these specific strong interdomain interactions should fold reliably.
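The statement that the misfolded ensembles are energetically disfavored but entropically favored can be illustrated with a two-state free-energy sketch, F = E - TS. The energy and entropy differences below are hypothetical placeholders in arbitrary units, not AWSEM results, and the function names are invented for illustration.

```python
def free_energy_gap(delta_e, delta_s, temperature):
    """Return F_misfolded - F_native = delta_e - T * delta_s.

    delta_e: E_misfolded - E_native (positive: misfolded is higher in energy)
    delta_s: S_misfolded - S_native (positive: misfolded has more entropy)
    """
    return delta_e - temperature * delta_s


def crossover_temperature(delta_e, delta_s):
    """Temperature at which the two ensembles are equally stable."""
    if delta_s <= 0:
        raise ValueError("requires an entropically favored misfolded ensemble")
    return delta_e / delta_s


# Hypothetical values (assumptions, arbitrary but consistent units)
DELTA_E = 20.0
DELTA_S = 0.05

t_star = crossover_temperature(DELTA_E, DELTA_S)
for temperature in (300.0, 500.0):
    gap = free_energy_gap(DELTA_E, DELTA_S, temperature)
    favored = "native" if gap > 0 else "misfolded"
    print(f"T = {temperature:.0f}: gap = {gap:+.1f}, favored ensemble: {favored}")
print(f"crossover temperature = {t_star:.1f}")
```

Below the crossover temperature the energy term dominates and the native state is favored; above it the entropy term dominates, consistent with the misfolded ensembles becoming more stable under denaturing conditions.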
Project description:Recent single-molecule experiments, using either atomic force microscopy (AFM) or Förster resonance energy transfer (FRET), have shown that multidomain proteins containing tandem repeats may form stable misfolded structures. Topology-based simulation models have been used successfully to generate models for these structures with domain-swapped features, fully consistent with the available data. However, it is also known that some multidomain protein folds exhibit no evidence for misfolding, even when adjacent domains have identical sequences. Here we pose the question: what factors influence the propensity of a given fold to undergo domain-swapped misfolding? Using a coarse-grained simulation model, we can reproduce the known propensities of multidomain proteins to form domain-swapped misfolds, where data are available. Contrary to what might be naively expected based on the previously described misfolding mechanism, we find that the extent of misfolding is not determined by the relative folding rates or barrier heights for forming the domains present in the initial intermediates leading to folded or misfolded structures. Instead, it appears that the propensity is more closely related to the relative stability of the domains present in folded and misfolded intermediates. We show that these findings can be rationalized if the folded and misfolded domains are part of the same folding funnel, with commitment to one structure or the other occurring only at a relatively late stage of folding. Nonetheless, the results are still fully consistent with the kinetic models previously proposed to explain misfolding, with a specific interpretation of the observed rate coefficients. Finally, we investigate the relation between interdomain linker length and misfolding, and propose a simple alchemical model to predict the propensity for domain-swapped misfolding of multidomain proteins.
Project description:Experiments on artificial multidomain protein constructs have probed the early stages of aggregation processes, but structural details of the species that initiate aggregation remain elusive. Using the associative-memory, water-mediated, structure and energy model known as AWSEM, a transferable coarse-grained protein model, we performed simulations of fused constructs composed of up to four copies of the Titin I27 domain or its mutant I27* (I59E). Free energy calculations enable us to quantify the conditions under which such multidomain constructs will spontaneously misfold. Consistent with experimental results, the dimer of I27 is found to be the smallest spontaneously misfolding construct. Our results show how structurally distinct misfolded states can be stabilized under different thermodynamic conditions, and this result provides a plausible link between the single-molecule misfolding experiments under native conditions and aggregation experiments under denaturing conditions. The conditions for spontaneous misfolding are determined by the interplay among temperature, effective local protein concentration, and the strength of the interdomain interactions. Above the folding temperature, fusing additional domains to the monomer destabilizes the native state, and the entropically stabilized amyloid-like state is favored. Because it is primarily energetically stabilized, the domain-swapped state is more likely to be important under native conditions. Both protofibril-like and branching structures are found in annealing simulations starting from extended structures, and these structures suggest a possible connection between the existence of multiple amyloidogenic segments in each domain and the formation of branched, amorphous aggregates as opposed to linear fibrillar structures.
Project description:A large range of debilitating medical conditions is linked to protein misfolding, which may compete with productive folding particularly in proteins containing multiple domains. Seventy-five per cent of the eukaryotic proteome consists of multidomain proteins, yet it is not understood how interdomain misfolding is avoided. It has been proposed that maintaining low sequence identity between covalently linked domains is a mechanism to avoid misfolding. Here we use single-molecule Förster resonance energy transfer to detect and quantify rare misfolding events in tandem immunoglobulin domains from the I band of titin under native conditions. About 5.5 per cent of molecules with identical domains misfold during refolding in vitro and form an unexpectedly stable state with an unfolding half-time of several days. Tandem arrays of immunoglobulin-like domains in humans show significantly lower sequence identity between neighbouring domains than between non-adjacent domains. In particular, the sequence identity of neighbouring domains has been found to be preferentially below 40 per cent. We observe no misfolding for a tandem of naturally neighbouring domains with low sequence identity (24 per cent), whereas misfolding occurs between domains that are 42 per cent identical. Coarse-grained molecular simulations predict the formation of domain-swapped structures that are in excellent agreement with the observed transfer efficiency of the misfolded species. We infer that the interactions underlying misfolding are very specific and result in a sequence-specific domain-swapping mechanism. Diversifying the sequence between neighbouring domains seems to be a successful evolutionary strategy to avoid misfolding in multidomain proteins.
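The sequence-identity comparison between neighbouring domains can be sketched as follows. This is a minimal illustration that assumes the domain sequences are already aligned to equal length, and it uses one common definition of identity (matches divided by ungapped aligned positions); other definitions give different percentages. The sequences and function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def neighbour_identities(aligned_domains):
    """Percent identity for each pair of adjacent domains in a tandem array."""
    return [
        percent_identity(first, second)
        for first, second in zip(aligned_domains, aligned_domains[1:])
    ]


# Toy aligned fragments (hypothetical, not real titin domains)
domains = ["LIEVEKPLYG", "LIEVEKPLYG", "MKE-TRPAFG"]
for index, identity in enumerate(neighbour_identities(domains)):
    flag = "above" if identity > 40.0 else "at or below"
    print(f"domains {index + 1}-{index + 2}: {identity:.0f}% ({flag} 40%)")
```

The 40 per cent threshold in the loop mirrors the value quoted in the abstract; it is used here only to label the toy output.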
Project description:Multidomain proteins with two or more independently folded functional domains are prevalent in nature. Whereas most multidomain proteins are linked linearly in sequence, roughly one-tenth possess domain insertions where a guest domain is implanted into a loop of a host domain, such that the two domains are connected by a pair of interdomain linkers. Here, we characterized the influence of the interdomain linkers on the structure and dynamics of a domain-insertion protein in which the guest LysM domain is inserted into a central loop of the host CVNH domain. Expanding upon our previous crystallographic and NMR studies, we applied SAXS in combination with NMR paramagnetic relaxation enhancement to construct a structural model of the overall two-domain system. Although the two domains have no fixed relative orientation, certain orientations were found to be preferred over others. We also assessed the accuracies of molecular mechanics force fields in modeling the structure and dynamics of tethered multidomain proteins by integrating our experimental results with microsecond-scale atomistic molecular dynamics simulations. In particular, our evaluation of two different combinations of the latest force fields and water models revealed that both combinations accurately reproduce certain structural and dynamical properties, but are inaccurate for others. Overall, our study illustrates the value of integrating experimental NMR and SAXS studies with long timescale atomistic simulations for characterizing structural ensembles of flexibly linked multidomain systems.
Project description:This review is a tutorial for scientists interested in the problem of protein structure prediction, particularly those interested in using coarse-grained molecular dynamics models that are optimized using lessons learned from the energy landscape theory of protein folding. We also present a review of the results of the AMH/AMC/AMW/AWSEM family of coarse-grained molecular dynamics protein folding models to illustrate the points covered in the first part of the article. Accurate coarse-grained structure prediction models can be used to investigate a wide range of conceptual and mechanistic issues outside of protein structure prediction; specifically, the paper concludes by reviewing how AWSEM has in recent years been able to elucidate questions related to the unusual kinetic behavior of artificially designed proteins, multidomain protein misfolding, and the initial stages of protein aggregation.
Project description:Folding of small proteins often occurs in a two-state manner and is well understood both experimentally and theoretically. However, many proteins are much larger and often populate misfolded states, complicating their folding process significantly. Here we study the complete folding and assembly process of the 1,418 amino acid, dimeric chaperone Hsp90 using single-molecule optical tweezers. Although the isolated C-terminal domain shows two-state folding, we find that the isolated N-terminal as well as the middle domain populate ensembles of fast-forming, misfolded states. These intradomain misfolds slow down folding by an order of magnitude. Modeling folding as a competition between productive and misfolding pathways allows us to fully describe the folding kinetics. Beyond intradomain misfolding, folding of the full-length protein is further slowed by the formation of interdomain misfolds, suggesting that with growing chain lengths, such misfolds will dominate folding kinetics. Interestingly, we find that small stretching forces applied to the chain can accelerate folding by preventing the formation of cross-domain misfolding intermediates by leading the protein along productive pathways to the native state. The same effect is achieved by cotranslational folding at the ribosome in vivo.
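One simple way to see how fast-forming misfolded states can slow folding is a three-state scheme, M <-> U -> N, in which off-pathway misfolding equilibrates much faster than productive folding. This is an assumed, illustrative model and not necessarily the kinetic scheme fitted in the study; the rate constants are hypothetical.

```python
def pre_equilibrium_folding_rate(k_fold, k_mis_on, k_mis_off):
    """Effective folding rate when U <-> M equilibrates faster than U -> N.

    Only the fraction 1 / (1 + K) of non-native molecules is in U at any
    time, where K = k_mis_on / k_mis_off, so folding is slowed by (1 + K).
    """
    if min(k_fold, k_mis_on, k_mis_off) <= 0:
        raise ValueError("all rate constants must be positive")
    k_eq = k_mis_on / k_mis_off
    return k_fold / (1.0 + k_eq)


# Hypothetical rate constants in s^-1 (misfolding much faster than folding)
K_FOLD = 1.0
K_MIS_ON = 900.0
K_MIS_OFF = 100.0

k_obs = pre_equilibrium_folding_rate(K_FOLD, K_MIS_ON, K_MIS_OFF)
slowdown = K_FOLD / k_obs
print(f"observed rate = {k_obs:.3f} s^-1, slowdown = {slowdown:.1f}x")
```

With these placeholder values the misfolded ensemble is nine times more populated than the unfolded state, which corresponds to an order-of-magnitude slowdown of the kind described above.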
Project description:The N-terminal regulatory domains of bacterial response regulator proteins catalyze phosphoryl transfer and function as phosphorylation-dependent regulatory switches to control the output activities of C-terminal effector domains. Structures of numerous isolated regulatory and effector domains have been determined. However, a detailed understanding of regulatory interactions among these domains has been limited by the relative paucity of structural data for intact multidomain response regulator proteins. The first multidomain structures determined, those of transcription factor NarL and methylesterase CheB, both revealed extensive interdomain interfaces. The regulatory domains obstruct access to the functional sites of the effector domains, indicating a regulatory mechanism based on inhibition. In contrast, the recently determined structure of the OmpR/PhoB homologue DrrD revealed no significant interdomain interface, suggesting that the domains are tethered by a flexible linker and lack a fixed orientation relative to each other. To address the generality of this feature, we have determined the 1.8-Å resolution crystal structure of Thermotoga maritima DrrB, providing a second structure of a multidomain response regulator of the OmpR/PhoB subfamily. The structure reveals an extensive domain interface of 751 Å² and therefore differs greatly from that observed in DrrD. Residues that are crucial players in defining the activation state of the regulatory domain contribute to this interface, implying that conformational changes associated with phosphorylation will influence these intramolecular contacts. The DrrB and DrrD structures are suggestive of different signaling mechanisms, with intramolecular communication between N- and C-terminal domains making substantially different contributions to effector domain regulation in individual members of the OmpR/PhoB family.
Project description:The associative memory, water-mediated, structure and energy model (AWSEM) has been successfully used to study protein folding, binding, and aggregation problems. In this work, we introduce AWSEM-IDP, a new AWSEM branch for simulating intrinsically disordered proteins (IDPs), where the weights of the potentials determining secondary structure formation have been finely tuned, and a novel potential is introduced that helps to precisely control both the average extent of protein chain collapse and the chain's fluctuations in size. AWSEM-IDP can efficiently sample large conformational spaces, while retaining sufficient molecular accuracy to realistically model proteins. We applied this new model to two IDPs, demonstrating that AWSEM-IDP can reasonably well reproduce higher-resolution reference data, thus providing the foundation for a transferable IDP force field. Finally, we used thermodynamic perturbation theory to show that, in general, the conformational ensembles of IDPs are highly sensitive to fine-tuning of force field parameters.
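The thermodynamic perturbation idea mentioned above can be sketched as exponential reweighting: an ensemble average under perturbed parameters is estimated from samples drawn with the reference parameters, weighting each sample by exp(-ΔU/kT). This is a generic illustration with made-up numbers, not the AWSEM-IDP implementation; the function and variable names are hypothetical.

```python
import math


def reweighted_average(observable, delta_u, kt):
    """Estimate <A> under a perturbed potential from reference samples.

    observable: values A_i measured on reference-ensemble samples
    delta_u:    U_perturbed - U_reference evaluated on the same samples
    kt:         thermal energy in the same units as delta_u
    """
    if len(observable) != len(delta_u) or not observable:
        raise ValueError("need equal-length, non-empty inputs")
    shift = min(delta_u)  # subtract the minimum for numerical stability
    weights = [math.exp(-(du - shift) / kt) for du in delta_u]
    total = sum(weights)
    return sum(w * a for w, a in zip(weights, observable)) / total


# Hypothetical samples: radius of gyration (angstrom) and energy change
radius_of_gyration = [18.0, 22.0, 26.0, 30.0]
delta_u = [-0.6, -0.2, 0.2, 0.6]  # perturbation favors compact states
KT = 0.6

unperturbed = sum(radius_of_gyration) / len(radius_of_gyration)
perturbed = reweighted_average(radius_of_gyration, delta_u, KT)
print(f"reference <Rg> = {unperturbed:.2f}, reweighted <Rg> = {perturbed:.2f}")
```

Because the weights are exponential in the energy change, a modest parameter change can shift the ensemble average noticeably, which is one way to read the sensitivity reported in the abstract. The estimate is only reliable when the reference and perturbed ensembles overlap well.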
Project description:Many signaling proteins consist of globular domains connected by flexible linkers that allow for substantial domain motion. Because these domains often serve as complementary functional modules, the possibility of functionally important domain motions arises. To explore this possibility, we require knowledge of the ensemble of protein conformations sampled by interdomain motion. Measurements of NMR residual dipolar couplings (RDCs) of backbone HN bonds offer a per-residue characterization of interdomain dynamics, as the couplings are sensitive to domain orientation. A challenge in reaching this potential is the need to interpret the RDCs as averages over dynamic ensembles of domain conformations. Here, we address this challenge by introducing an efficient protocol for generating conformational ensembles appropriate for flexible, multi-domain proteins. The protocol uses map-restrained self-guided Langevin dynamics simulations to promote collective, interdomain motion while restraining the internal domain motion to near rigidity. Critically, the simulations retain an all-atom description for facile inclusion of site-specific NMR RDC restraints. The result is the rapid generation of conformational ensembles consistent with the RDC data. We illustrate this protocol on human Pin1, a two-domain peptidyl-prolyl isomerase relevant for cancer and Alzheimer's disease. The results include the ensemble of domain orientations sampled by Pin1, as well as those of a dysfunctional variant, I28A-Pin1. The differences between the ensembles corroborate our previous spin relaxation results that showed weakened interdomain contact in the I28A variant relative to wild type. Our protocol extends our abilities to explore the functional significance of protein domain motions.
Project description:Previous studies of the N-terminal PDZ tandem from PSD-95 produced divergent models and failed to identify interdomain contacts stabilizing the structure. We used ensemble and single-molecule FRET along with replica-exchange molecular dynamics to fully characterize the energy landscape. Simulations and experiments identified two conformations: an open-like conformation with a small contact interface stabilized by salt bridges, and a closed-like conformation with a larger contact interface stabilized by surface-exposed hydrophobic residues. Both interfaces were confirmed experimentally. Proximity of interdomain contacts to the binding pockets may explain the observed coupling between conformation and binding. The low-energy barrier between conformations allows submillisecond dynamics, which were time-averaged in previous NMR and FRET studies. Moreover, the small contact interfaces were likely overridden by lattice contacts as crystal structures were rarely sampled in simulations. Our hybrid approach can identify transient interdomain interactions, which are abundant in multidomain proteins yet often obscured by dynamic averaging.
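The time averaging of two rapidly interconverting conformations can be illustrated with the Förster relation, E = 1 / (1 + (r/R0)^6). The sketch below assumes that exchange is fast compared with the observation time, so the measured efficiency is the population-weighted mean of the per-state efficiencies. The distances, populations, and Förster radius are hypothetical, not values from the study.

```python
def fret_efficiency(distance, forster_radius):
    """Transfer efficiency for a single donor-acceptor distance."""
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


def fast_exchange_efficiency(populations, distances, forster_radius):
    """Population-weighted mean efficiency over interconverting states."""
    if len(populations) != len(distances) or not populations:
        raise ValueError("need equal-length, non-empty inputs")
    total = sum(populations)
    if total <= 0:
        raise ValueError("populations must sum to a positive value")
    weighted = sum(
        p * fret_efficiency(d, forster_radius)
        for p, d in zip(populations, distances)
    )
    return weighted / total


# Hypothetical closed-like and open-like states (distances in angstrom)
R0 = 50.0
DISTANCES = [40.0, 65.0]
POPULATIONS = [0.5, 0.5]

for label, distance in zip(("closed-like", "open-like"), DISTANCES):
    print(f"{label}: E = {fret_efficiency(distance, R0):.2f}")
averaged = fast_exchange_efficiency(POPULATIONS, DISTANCES, R0)
print(f"time-averaged: E = {averaged:.2f}")
```

A single averaged efficiency of this kind maps to one apparent distance that matches neither underlying state, which is how fast dynamics can obscure distinct conformations.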