Single-molecule fluorescence reveals sequence-specific misfolding in multidomain proteins.
ABSTRACT: A large range of debilitating medical conditions is linked to protein misfolding, which may compete with productive folding particularly in proteins containing multiple domains. Seventy-five per cent of the eukaryotic proteome consists of multidomain proteins, yet it is not understood how interdomain misfolding is avoided. It has been proposed that maintaining low sequence identity between covalently linked domains is a mechanism to avoid misfolding. Here we use single-molecule Förster resonance energy transfer to detect and quantify rare misfolding events in tandem immunoglobulin domains from the I band of titin under native conditions. About 5.5 per cent of molecules with identical domains misfold during refolding in vitro and form an unexpectedly stable state with an unfolding half-time of several days. Tandem arrays of immunoglobulin-like domains in humans show significantly lower sequence identity between neighbouring domains than between non-adjacent domains. In particular, the sequence identity of neighbouring domains has been found to be preferentially below 40 per cent. We observe no misfolding for a tandem of naturally neighbouring domains with low sequence identity (24 per cent), whereas misfolding occurs between domains that are 42 per cent identical. Coarse-grained molecular simulations predict the formation of domain-swapped structures that are in excellent agreement with the observed transfer efficiency of the misfolded species. We infer that the interactions underlying misfolding are very specific and result in a sequence-specific domain-swapping mechanism. Diversifying the sequence between neighbouring domains seems to be a successful evolutionary strategy to avoid misfolding in multidomain proteins.
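The abstract above rests on percent sequence identity between neighbouring domains. The following is a minimal Python sketch of one common way to compute it from pre-aligned sequences. The function names, the toy sequences (which are made up and are not titin domains), and the choice to ignore gapped columns in the denominator are all assumptions for illustration; the study's actual alignment and identity definition may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns with a gap ('-') in either sequence are ignored.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def neighbour_identities(aligned_domains: list[str]) -> list[float]:
    """Identity of each domain with the next one in the tandem array."""
    return [
        percent_identity(x, y)
        for x, y in zip(aligned_domains, aligned_domains[1:])
    ]


# Toy, made-up aligned sequences (NOT real titin domains).
domains = ["ACDEFGHIKL", "ACDEFGHIKL", "ACDQFGHWKL", "MCD-YGNIRL"]

for i, ident in enumerate(neighbour_identities(domains)):
    # 40 per cent is the threshold quoted in the abstract for neighbouring domains.
    flag = "above" if ident > 40.0 else "at or below"
    print(f"domains {i}-{i + 1}: {ident:.1f}% identity ({flag} 40%)")
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence length) give different percentages, so any threshold comparison should state which definition is used.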
Project description:Neighbouring domains of multidomain proteins with homologous tandem repeats have divergent sequences, probably as a result of evolutionary pressure to avoid misfolding and aggregation, particularly at high cellular protein concentrations. Here we combine microfluidic-mixing single-molecule kinetics, ensemble experiments and molecular simulations to investigate how misfolding between the immunoglobulin-like domains of titin is prevented. Surprisingly, we find that during refolding of tandem repeats, independent of sequence identity, more than half of all molecules transiently form a wide range of misfolded conformations. Simulations suggest that a large fraction of these misfolds resemble an intramolecular amyloid-like state reported in computational studies. However, for naturally occurring neighbours with low sequence identity, these transient misfolds disappear much more rapidly than for identical neighbours. We thus propose that evolutionary sequence divergence between domains is required to suppress the population of long-lived, potentially harmful misfolded states, whereas large populations of transient misfolded states appear to be tolerated.
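The distinction between transient and long-lived misfolded states can be made concrete with a short sketch. It assumes, purely for illustration, that a misfolded population decays with single-exponential (first-order) kinetics. The function names and the two half-times below are hypothetical placeholders, not values reported in the study.

```python
import math


def rate_from_half_time(t_half: float) -> float:
    """First-order rate constant k = ln(2) / t_half (same time unit as t_half)."""
    if t_half <= 0:
        raise ValueError("half-time must be positive")
    return math.log(2) / t_half


def fraction_remaining(t: float, t_half: float) -> float:
    """Fraction of the misfolded population still present after time t."""
    return math.exp(-rate_from_half_time(t_half) * t)


# Hypothetical half-times in hours, chosen only to contrast the two regimes.
transient_half_time = 0.01   # assumed short-lived misfold
long_lived_half_time = 72.0  # assumed long-lived misfold

for label, t_half in [("transient", transient_half_time),
                      ("long-lived", long_lived_half_time)]:
    left = fraction_remaining(1.0, t_half)
    print(f"{label}: {left:.3g} of the misfolded population remains after 1 h")
```

Under this assumption, a population with a short half-time is effectively gone within an hour, while one with a half-time of days is almost unchanged. Real misfolded ensembles may be heterogeneous and need not follow a single exponential.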
Project description:Frustration from strong interdomain interactions can make misfolding a more severe problem in multidomain proteins than in single-domain proteins. On the basis of bioinformatic surveys, it has been suggested that lowering the sequence identity between neighboring domains is one of nature's solutions to the multidomain misfolding problem. We investigate folding of multidomain proteins using the associative-memory, water-mediated, structure and energy model (AWSEM), a predictive coarse-grained protein force field. We find that reducing sequence identity not only decreases the formation of domain-swapped contacts but also decreases the formation of strong self-recognition contacts between β-strands with high hydrophobic content. The ensembles of misfolded structures that result from forming these amyloid-like interactions are energetically disfavored compared with the native state, but entropically favored. Therefore, these ensembles are more stable than the native ensemble under denaturing conditions, such as high temperature. Domain-swapped contacts compete with self-recognition contacts in forming various trapped states, and point mutations can shift the balance between the two types of interaction. We predict that multidomain proteins that lack these specific strong interdomain interactions should fold reliably.
Project description:Recent single molecule experiments, using either atomic force microscopy (AFM) or Förster resonance energy transfer (FRET) have shown that multidomain proteins containing tandem repeats may form stable misfolded structures. Topology-based simulation models have been used successfully to generate models for these structures with domain-swapped features, fully consistent with the available data. However, it is also known that some multidomain protein folds exhibit no evidence for misfolding, even when adjacent domains have identical sequences. Here we pose the question: what factors influence the propensity of a given fold to undergo domain-swapped misfolding? Using a coarse-grained simulation model, we can reproduce the known propensities of multidomain proteins to form domain-swapped misfolds, where data is available. Contrary to what might be naively expected based on the previously described misfolding mechanism, we find that the extent of misfolding is not determined by the relative folding rates or barrier heights for forming the domains present in the initial intermediates leading to folded or misfolded structures. Instead, it appears that the propensity is more closely related to the relative stability of the domains present in folded and misfolded intermediates. We show that these findings can be rationalized if the folded and misfolded domains are part of the same folding funnel, with commitment to one structure or the other occurring only at a relatively late stage of folding. Nonetheless, the results are still fully consistent with the kinetic models previously proposed to explain misfolding, with a specific interpretation of the observed rate coefficients. Finally, we investigate the relation between interdomain linker length and misfolding, and propose a simple alchemical model to predict the propensity for domain-swapped misfolding of multidomain proteins.
Project description:Spectrin is a multidomain cytoskeletal protein whose component three-helix bundle domains are expected to experience mechanical force in vivo. In thermodynamic and kinetic studies, neighboring domains of chicken brain alpha-spectrin R16 and R17 have been shown to behave cooperatively. Is this cooperativity maintained under force? The effect of force on these spectrin domains was investigated using atomic force microscopy. The response of the individual domains to force was compared to that of the tandem repeat R1617. Importantly, nonhelical linkers (all-beta immunoglobulin domains) were used to avoid formation of nonnative helical linkers. We show that, in contrast to previous studies on spectrin repeats, only 3% of R1617 unfolding events gave an increase in contour length consistent with cooperative two-domain unfolding events. Furthermore, the unfolding forces for R1617 were the same as those for the unfolding of R16 or R17 alone. This is a strong indication that the cooperative unfolding behavior observed in the stopped-flow studies is absent between these spectrin domains when force is acting as a denaturant. Our evidence suggests that the rare double unfolding events result from misfolding between adjacent repeats. We suggest that this switch from cooperative to independent behavior allows multidomain proteins to maintain integrity under applied force.
Project description:Tandem homologous domains in proteins are susceptible to misfolding through the formation of domain swaps, non-native conformations involving the exchange of equivalent structural elements between adjacent domains. Cutting-edge biophysical experiments have recently allowed the observation of tandem domain swapping events at the single-molecule level. In addition, computer simulations have shed light on the molecular mechanisms of domain swap formation and serve as the basis for methods to systematically predict them. At present, the number of studies on tandem domain swaps is still small and limited to a few domain folds, but they offer important insights into the folding and evolution of multidomain proteins, with applications in the field of protein design.
Project description:BACKGROUND: Proteins show a broad range of evolutionary rates. Understanding the factors that are responsible for the characteristic rate of evolution of a given protein arguably is one of the major goals of evolutionary biology. A long-standing general assumption used to be that the evolution rate is, primarily, determined by the specific functional constraints that affect the given protein. These constraints were traditionally thought to depend both on the specific features of the protein's structure and its biological role. The advent of systems biology brought about new types of data, such as expression level and protein-protein interactions, and unexpectedly, a variety of correlations between protein evolution rate and these variables have been observed. The strongest connections by far were repeatedly seen between protein sequence evolution rate and the expression level of the respective gene. It has been hypothesized that this link is due to the selection for the robustness of the protein structure to mistranslation-induced misfolding that is particularly important for highly expressed proteins and is the dominant determinant of the sequence evolution rate. RESULTS: This work is an attempt to assess the relative contributions of protein domain structure and function, on the one hand, and expression level on the other hand, to the rate of sequence evolution. To this end, we performed a genome-wide analysis of the effect of the fusion of a pair of domains in multidomain proteins on the difference in the domain-specific evolutionary rates. The mistranslation-induced misfolding hypothesis would predict that, within multidomain proteins, fused domains, on average, should evolve at substantially closer rates than the same domains in different proteins because, within a multidomain protein, all domains are translated at the same rate. We performed a comprehensive comparison of the evolutionary rates of mammalian and plant protein domains that are either joined in multidomain proteins or contained in distinct proteins. Substantial homogenization of evolutionary rates in multidomain proteins was, indeed, observed in both animals and plants, although highly significant differences between domain-specific rates remained. The contributions of the translation rate, as determined by the effect of the fusion of a pair of domains within a multidomain protein, and intrinsic, domain-specific structural-functional constraints appear to be comparable in magnitude. CONCLUSION: Fusion of domains in a multidomain protein results in substantial homogenization of the domain-specific evolutionary rates but significant differences between domain-specific evolution rates remain. Thus, the rate of translation and intrinsic structural-functional constraints both exert sizable and comparable effects on sequence evolution.
Project description:Titin is a giant filamentous polypeptide of multidomain construction spanning between the Z- and M-lines of the cardiac muscle sarcomere. Extension of the I-band segment of titin gives rise to a force that underlies part of the diastolic force of cardiac muscle. Titin's force arises from its extensible I-band region, which consists of two main segment types: serially linked immunoglobulin-like domains (tandem Ig segments) interrupted with a proline (P)-, glutamate (E)-, valine (V)-, and lysine (K)-rich segment called PEVK segment. In addition to these segments, the extensible region of cardiac titin also contains a unique 572-residue sequence that is part of the cardiac-specific N2B element. In this work, immunoelectron microscopy was used to study the molecular origin of the in vivo extensibility of the I-band region of cardiac titin. The extensibility of the tandem Ig segments, the PEVK segment, and that of the unique N2B sequence were studied, using novel antibodies against Ig domains that flank these segments. Results show that only the tandem Igs extend at sarcomere lengths (SLs) below approximately 2.0 µm, and that, at longer SLs, the PEVK and the unique sequence extend as well. At the longest SLs that may be reached under physiological conditions (approximately 2.3 µm), the PEVK segment length is approximately 50 nm whereas the unique N2B sequence is approximately 80 nm long. Thus, the unique sequence provides additional extensibility to cardiac titins and this may eliminate the necessity for unfolding of Ig domains under physiological conditions. In summary, this work provides direct evidence that the three main molecular subdomains of N2B titin are all extensible and that their contribution to extensibility decreases in the order of tandem Igs, unique N2B sequence, and PEVK segment.
Project description:The study of the folding of single domains, in the context of their multidomain environment, is important because more than 70% of eukaryotic proteins are composed of multiple domains. The structures of the tandem immunoglobulin (Ig) domain pairs A164-A165 and A168-A169, from the A-band of the giant muscle protein titin, reveal that they form tightly associated domain arrangements, connected by a continuous β-strand. We investigate the thermodynamic and kinetic properties of these tandem domain pairs. While A164-A165 apparently behaves as a single cooperative unit at equilibrium, unfolding without the accumulation of a large population of intermediates, domains in A168-A169 behave independently. Although A169 appears to be stabilized in the tandem protein, we show that this is due to nonspecific stabilization by extension. We elucidate the folding and unfolding pathways of both tandem pairs and show that cooperativity in A164-A165 is a manifestation of the relative refolding and unfolding rate constants of each individual domain. We infer that the differences between the two tandem pairs result from a different pattern of interactions at the domain/domain interface.
Project description:Studying the effects of pathogenic mutations is more complex in multidomain proteins when compared with single domains: mutations occurring at domain boundaries may have a large effect on a neighbouring domain that will not be detected in a single-domain system. To demonstrate this, we present a study that utilizes well-characterized model protein domains from human spectrin to investigate the effect of disease- and non-disease-causing single point mutations occurring at the boundaries of human spectrin repeats. Our results show that mutations in the single domains have no clear correlation with stability and disease; however, when studied in a tandem model system, the disease-causing mutations are shown to disrupt stabilizing interactions that exist between domains. This results in a much larger decrease in stability than would otherwise have been predicted, and demonstrates the importance of studying such mutations in the correct protein context.
Project description:BACKGROUND: The State of Bavaria is involved in a research program that will lead to the construction of a DNA barcode library for all animal species within its territorial boundaries. The present study provides a comprehensive DNA barcode library for the Geometridae, one of the most diverse of insect families. METHODOLOGY/PRINCIPAL FINDINGS: This study reports DNA barcodes for 400 Bavarian geometrid species, 98 per cent of the known fauna, and approximately one per cent of all Bavarian animal species. Although 98.5% of these species possess diagnostic barcode sequences in Bavaria, records from neighbouring countries suggest that species-level resolution may be compromised in up to 3.5% of cases. All taxa which apparently share barcodes are discussed in detail. One case of modest divergence (1.4%) revealed a species overlooked by the current taxonomic system: Eupithecia goossensiata Mabille, 1869 stat.n. is raised from synonymy with Eupithecia absinthiata (Clerck, 1759) to species rank. Deep intraspecific sequence divergences (>2%) were detected in 20 traditionally recognized species. CONCLUSIONS/SIGNIFICANCE: The study emphasizes the effectiveness of DNA barcoding as a tool for monitoring biodiversity. Open access is provided to a data set that includes records for 1,395 geometrid specimens (331 species) from Bavaria, with 69 additional species from neighbouring regions. Taxa with deep intraspecific sequence divergences are undergoing more detailed analysis to ascertain if they represent cases of cryptic diversity.