Frustration in the energy landscapes of multidomain protein misfolding.
ABSTRACT: Frustration from strong interdomain interactions can make misfolding a more severe problem in multidomain proteins than in single-domain proteins. On the basis of bioinformatic surveys, it has been suggested that lowering the sequence identity between neighboring domains is one of nature's solutions to the multidomain misfolding problem. We investigate folding of multidomain proteins using the associative-memory, water-mediated, structure and energy model (AWSEM), a predictive coarse-grained protein force field. We find that reducing sequence identity not only decreases the formation of domain-swapped contacts but also decreases the formation of strong self-recognition contacts between β-strands with high hydrophobic content. The ensembles of misfolded structures that result from forming these amyloid-like interactions are energetically disfavored compared with the native state, but entropically favored. Therefore, these ensembles are more stable than the native ensemble under denaturing conditions, such as high temperature. Domain-swapped contacts compete with self-recognition contacts in forming various trapped states, and point mutations can shift the balance between the two types of interaction. We predict that multidomain proteins that lack these specific strong interdomain interactions should fold reliably.
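The thermodynamic argument in the abstract (a misfolded ensemble that is higher in energy but also higher in entropy becomes the more stable one at high temperature) can be illustrated with a minimal two-ensemble sketch using F = E - T*S. The function names and all numerical values below are assumptions chosen for illustration; they are not AWSEM quantities or results from the paper.

```python
def free_energy(energy, entropy, temperature):
    """Free energy F = E - T*S of an ensemble (arbitrary but consistent units)."""
    return energy - temperature * entropy


def crossover_temperature(delta_energy, delta_entropy):
    """Temperature at which two ensembles have equal free energy.

    delta_energy and delta_entropy are (misfolded - native) differences.
    Setting delta_energy - T * delta_entropy = 0 gives T = delta_energy / delta_entropy.
    """
    if delta_entropy <= 0:
        raise ValueError("misfolded ensemble must be entropically favored")
    return delta_energy / delta_entropy


# Hypothetical numbers: the misfolded ensemble costs 10 energy units
# relative to the native state but gains 0.025 entropy units.
delta_e = 10.0
delta_s = 0.025
t_cross = crossover_temperature(delta_e, delta_s)

for temperature in (300.0, 450.0):
    delta_f = free_energy(delta_e, delta_s, temperature)
    favored = "native" if delta_f > 0 else "misfolded"
    print(f"T = {temperature}: delta F = {delta_f:+.2f}, favored = {favored}")
```

Below the crossover temperature the energetic penalty dominates and the native state is favored; above it the entropic term wins, which is the sense in which the abstract calls the amyloid-like ensembles more stable under denaturing conditions.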
Project description:Recent single molecule experiments, using either atomic force microscopy (AFM) or Förster resonance energy transfer (FRET) have shown that multidomain proteins containing tandem repeats may form stable misfolded structures. Topology-based simulation models have been used successfully to generate models for these structures with domain-swapped features, fully consistent with the available data. However, it is also known that some multidomain protein folds exhibit no evidence for misfolding, even when adjacent domains have identical sequences. Here we pose the question: what factors influence the propensity of a given fold to undergo domain-swapped misfolding? Using a coarse-grained simulation model, we can reproduce the known propensities of multidomain proteins to form domain-swapped misfolds, where data is available. Contrary to what might be naively expected based on the previously described misfolding mechanism, we find that the extent of misfolding is not determined by the relative folding rates or barrier heights for forming the domains present in the initial intermediates leading to folded or misfolded structures. Instead, it appears that the propensity is more closely related to the relative stability of the domains present in folded and misfolded intermediates. We show that these findings can be rationalized if the folded and misfolded domains are part of the same folding funnel, with commitment to one structure or the other occurring only at a relatively late stage of folding. Nonetheless, the results are still fully consistent with the kinetic models previously proposed to explain misfolding, with a specific interpretation of the observed rate coefficients. Finally, we investigate the relation between interdomain linker length and misfolding, and propose a simple alchemical model to predict the propensity for domain-swapped misfolding of multidomain proteins.
Project description:Experiments on artificial multidomain protein constructs have probed the early stages of aggregation processes, but structural details of the species that initiate aggregation remain elusive. Using the associative-memory, water-mediated, structure and energy model known as AWSEM, a transferable coarse-grained protein model, we performed simulations of fused constructs composed of up to four copies of the Titin I27 domain or its mutant I27* (I59E). Free energy calculations enable us to quantify the conditions under which such multidomain constructs will spontaneously misfold. Consistent with experimental results, the dimer of I27 is found to be the smallest spontaneously misfolding construct. Our results show how structurally distinct misfolded states can be stabilized under different thermodynamic conditions, and this result provides a plausible link between the single-molecule misfolding experiments under native conditions and aggregation experiments under denaturing conditions. The conditions for spontaneous misfolding are determined by the interplay among temperature, effective local protein concentration, and the strength of the interdomain interactions. Above the folding temperature, fusing additional domains to the monomer destabilizes the native state, and the entropically stabilized amyloid-like state is favored. Because it is primarily energetically stabilized, the domain-swapped state is more likely to be important under native conditions. Both protofibril-like and branching structures are found in annealing simulations starting from extended structures, and these structures suggest a possible connection between the existence of multiple amyloidogenic segments in each domain and the formation of branched, amorphous aggregates as opposed to linear fibrillar structures.
Project description:A large range of debilitating medical conditions is linked to protein misfolding, which may compete with productive folding particularly in proteins containing multiple domains. Seventy-five per cent of the eukaryotic proteome consists of multidomain proteins, yet it is not understood how interdomain misfolding is avoided. It has been proposed that maintaining low sequence identity between covalently linked domains is a mechanism to avoid misfolding. Here we use single-molecule Förster resonance energy transfer to detect and quantify rare misfolding events in tandem immunoglobulin domains from the I band of titin under native conditions. About 5.5 per cent of molecules with identical domains misfold during refolding in vitro and form an unexpectedly stable state with an unfolding half-time of several days. Tandem arrays of immunoglobulin-like domains in humans show significantly lower sequence identity between neighbouring domains than between non-adjacent domains. In particular, the sequence identity of neighbouring domains has been found to be preferentially below 40 per cent. We observe no misfolding for a tandem of naturally neighbouring domains with low sequence identity (24 per cent), whereas misfolding occurs between domains that are 42 per cent identical. Coarse-grained molecular simulations predict the formation of domain-swapped structures that are in excellent agreement with the observed transfer efficiency of the misfolded species. We infer that the interactions underlying misfolding are very specific and result in a sequence-specific domain-swapping mechanism. Diversifying the sequence between neighbouring domains seems to be a successful evolutionary strategy to avoid misfolding in multidomain proteins.
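The sequence identity figures quoted above (24, 40, and 42 per cent) are percentages of matching residues between two aligned domains. A minimal sketch of that calculation follows, assuming the two sequences have already been aligned and that gapped columns are excluded from the count. Identity conventions differ in their choice of denominator, so this is one common definition rather than necessarily the one used in the study, and the toy strings are not real titin sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap are skipped (an assumed
    convention); identity is matches / compared columns * 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # ignore gapped columns
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only).
print(percent_identity("ACDE", "ACDF"))
print(percent_identity("AC-E", "ACDE"))
```

Applied to every pair of neighbouring domains in a tandem array, a function like this would give the kind of identity distribution that the survey compares against the roughly 40 per cent threshold.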
Project description:Multidomain proteins with two or more independently folded functional domains are prevalent in nature. Whereas most multidomain proteins are linked linearly in sequence, roughly one-tenth possess domain insertions where a guest domain is implanted into a loop of a host domain, such that the two domains are connected by a pair of interdomain linkers. Here, we characterized the influence of the interdomain linkers on the structure and dynamics of a domain-insertion protein in which the guest LysM domain is inserted into a central loop of the host CVNH domain. Expanding upon our previous crystallographic and NMR studies, we applied SAXS in combination with NMR paramagnetic relaxation enhancement to construct a structural model of the overall two-domain system. Although the two domains have no fixed relative orientation, certain orientations were found to be preferred over others. We also assessed the accuracies of molecular mechanics force fields in modeling the structure and dynamics of tethered multidomain proteins by integrating our experimental results with microsecond-scale atomistic molecular dynamics simulations. In particular, our evaluation of two different combinations of the latest force fields and water models revealed that both combinations accurately reproduce certain structural and dynamical properties, but are inaccurate for others. Overall, our study illustrates the value of integrating experimental NMR and SAXS studies with long timescale atomistic simulations for characterizing structural ensembles of flexibly linked multidomain systems.
Project description:The functional mechanisms of multidomain proteins often exploit interdomain interactions, or "cross-talk." An example is human Pin1, an essential mitotic regulator consisting of a Trp-Trp (WW) domain flexibly tethered to a peptidyl-prolyl isomerase (PPIase) domain, resulting in interdomain interactions important for Pin1 function. Substrate binding to the WW domain alters its transient contacts with the PPIase domain via means that are only partially understood. Accordingly, we have investigated Pin1 interdomain interactions using NMR paramagnetic relaxation enhancement (PRE) and molecular dynamics (MD) simulations. The PREs show that apo-Pin1 samples interdomain contacts beyond the range suggested by previous structural studies. They further show that substrate binding to the WW domain simultaneously alters interdomain separation and the internal conformation of the WW domain. A 4.5-μs all-atom MD simulation of apo-Pin1 suggests that the fluctuations of interdomain distances are correlated with fluctuations of WW domain interresidue contacts involved in substrate binding. Thus, the interdomain/WW domain conformations sampled by apo-Pin1 may already include a range of conformations appropriate for binding Pin1's numerous substrates. The proposed coupling between intra-/interdomain conformational fluctuations is a consequence of the dynamic modular architecture of Pin1. Such modular architecture is common among cell-cycle proteins; thus, the WW-PPIase domain cross-talk mechanisms of Pin1 may be relevant for their mechanisms as well.
Project description:This review is a tutorial for scientists interested in the problem of protein structure prediction, particularly those interested in using coarse-grained molecular dynamics models that are optimized using lessons learned from the energy landscape theory of protein folding. We also present a review of the results of the AMH/AMC/AMW/AWSEM family of coarse-grained molecular dynamics protein folding models to illustrate the points covered in the first part of the article. Accurate coarse-grained structure prediction models can be used to investigate a wide range of conceptual and mechanistic issues outside of protein structure prediction; specifically, the paper concludes by reviewing how AWSEM has in recent years been able to elucidate questions related to the unusual kinetic behavior of artificially designed proteins, multidomain protein misfolding, and the initial stages of protein aggregation.
Project description:Proteins have evolved by incorporating several structural units within a single polypeptide. As a result, multidomain proteins constitute a large fraction of all proteomes. Their domains often fold to their native structures individually and vectorially as each domain emerges from the ribosome or the protein translocation channel, which decreases the risk of interdomain misfolding. However, some multidomain proteins fold in the endoplasmic reticulum (ER) nonvectorially via intermediates with nonnative disulfide bonds, which were believed to be shuffled to native ones slowly after synthesis. Yet, the mechanism by which they fold nonvectorially remains unclear. Using two-dimensional (2D) gel electrophoresis and a conformation-specific antibody that recognizes a correctly folded domain, we show here that shuffling of nonnative disulfide bonds to native ones in the most N-terminal region of LDL receptor (LDLR) began at a specific point during synthesis. Deletion analysis identified a region on LDLR that assisted with disulfide shuffling in the upstream domain, thereby promoting its cotranslational folding. Thus, a plasma membrane-bound multidomain protein has evolved a sequence that promotes the nonvectorial folding of its upstream domains. These findings demonstrate that nonvectorial folding of a multidomain protein in the ER of mammalian cells is more coordinated and elaborate than previously thought. Thus, our findings alter our current view of how a multidomain protein folds nonvectorially in the ER of living cells.
Project description:Folding of small proteins often occurs in a two-state manner and is well understood both experimentally and theoretically. However, many proteins are much larger and often populate misfolded states, complicating their folding process significantly. Here we study the complete folding and assembly process of the 1,418 amino acid, dimeric chaperone Hsp90 using single-molecule optical tweezers. Although the isolated C-terminal domain shows two-state folding, we find that the isolated N-terminal as well as the middle domain populate ensembles of fast-forming, misfolded states. These intradomain misfolds slow down folding by an order of magnitude. Modeling folding as a competition between productive and misfolding pathways allows us to fully describe the folding kinetics. Beyond intradomain misfolding, folding of the full-length protein is further slowed by the formation of interdomain misfolds, suggesting that with growing chain lengths, such misfolds will dominate folding kinetics. Interestingly, we find that small stretching forces applied to the chain can accelerate folding by preventing the formation of cross-domain misfolding intermediates by leading the protein along productive pathways to the native state. The same effect is achieved by cotranslational folding at the ribosome in vivo.
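The idea of folding as a competition between productive and misfolding pathways can be sketched with a generic three-state scheme: the unfolded chain U folds to N at rate k_fold, or is diverted into a misfolded trap M at rate k_misfold, from which it returns to U at rate k_escape. This is an assumed textbook scheme, not the kinetic model fitted in the study, and the rate values are hypothetical. Solving for the mean first-passage time from U to N gives (1 + k_misfold / k_escape) / k_fold.

```python
def mean_folding_time(k_fold, k_misfold, k_escape):
    """Mean first-passage time from U to N for U -> N, U <-> M.

    Derived from tau = 1/(k_fold + k_misfold)
                     + k_misfold/(k_fold + k_misfold) * (1/k_escape + tau),
    which rearranges to tau = (1 + k_misfold / k_escape) / k_fold.
    """
    if min(k_fold, k_misfold, k_escape) <= 0:
        raise ValueError("all rate coefficients must be positive")
    return (1.0 + k_misfold / k_escape) / k_fold


def slowdown_factor(k_fold, k_misfold, k_escape):
    """Ratio of the mean folding time with the trap to the time without it."""
    return mean_folding_time(k_fold, k_misfold, k_escape) * k_fold


# Hypothetical rates (per second): misfolds form nine times faster
# than they dissolve, which gives a tenfold slowdown in this scheme.
print(slowdown_factor(k_fold=1.0, k_misfold=9.0, k_escape=1.0))
```

In this scheme the slowdown depends only on the ratio k_misfold / k_escape, so fast-forming misfolds that dissolve comparatively slowly are enough to slow folding by an order of magnitude without changing the productive rate itself.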
Project description:The N-terminal regulatory domains of bacterial response regulator proteins catalyze phosphoryl transfer and function as phosphorylation-dependent regulatory switches to control the output activities of C-terminal effector domains. Structures of numerous isolated regulatory and effector domains have been determined. However, a detailed understanding of regulatory interactions among these domains has been limited by the relative paucity of structural data for intact multidomain response regulator proteins. The first multidomain structures determined, those of transcription factor NarL and methylesterase CheB, both revealed extensive interdomain interfaces. The regulatory domains obstruct access to the functional sites of the effector domains, indicating a regulatory mechanism based on inhibition. In contrast, the recently determined structure of the OmpR/PhoB homologue DrrD revealed no significant interdomain interface, suggesting that the domains are tethered by a flexible linker and lack a fixed orientation relative to each other. To address the generality of this feature, we have determined the 1.8-Å resolution crystal structure of Thermotoga maritima DrrB, providing a second structure of a multidomain response regulator of the OmpR/PhoB subfamily. The structure reveals an extensive domain interface of 751 Å² and therefore differs greatly from that observed in DrrD. Residues that are crucial players in defining the activation state of the regulatory domain contribute to this interface, implying that conformational changes associated with phosphorylation will influence these intramolecular contacts. The DrrB and DrrD structures are suggestive of different signaling mechanisms, with intramolecular communication between N- and C-terminal domains making substantially different contributions to effector domain regulation in individual members of the OmpR/PhoB family.
Project description:Changes at the cell surface enable bacteria to survive in dynamic environments, such as diverse niches of the human host. Here, we reveal "Periscope Proteins" as a widespread mechanism of bacterial surface alteration mediated through protein length variation. Tandem arrays of highly similar folded domains can form an elongated rod-like structure; thus, variation in the number of domains determines how far an N-terminal host ligand binding domain projects from the cell surface. Supported by newly available long-read genome sequencing data, we propose that this class could contain over 50 distinct proteins, including those implicated in host colonization and biofilm formation by human pathogens. In large multidomain proteins, sequence divergence between adjacent domains appears to reduce interdomain misfolding. Periscope Proteins break this "rule," suggesting that their length variability plays an important role in regulating bacterial interactions with host surfaces, other bacteria, and the immune system.