Project description:The recent availability of long equilibrium simulations of protein folding in atomistic detail for more than 10 proteins allows us to identify the key interactions driving folding. We find that the collective fraction of native amino acid contacts, Q, captures remarkably well the transition states for all the proteins with a folding free energy barrier. Going beyond this global picture, we devise two different measures to quantify the importance of individual interresidue contacts in the folding mechanism: (i) the log-ratio of lifetimes of contacts during folding transition paths and in the unfolded state and (ii) a Bayesian measure of how predictive the formation of each contact is for being on a transition path. Both of these measures indicate that native, or near-native, contacts are important for determining mechanism, as might be expected. More remarkably, however, we found that for almost all the proteins, with the designed protein α3D being a notable exception, nonnative contacts play no significant part in determining folding mechanisms.
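The Bayesian measure in (ii) can be illustrated with a minimal sketch, assuming each trajectory frame has already been labeled with whether a given contact is formed and whether the frame lies on a transition path. The function name and the per-frame boolean inputs are hypothetical, and the estimator used in the work itself may differ in detail; the sketch only applies Bayes' rule, P(TP | contact) = P(contact | TP) P(TP) / P(contact).

```python
def p_tp_given_contact(contact_formed, on_tp):
    """Estimate P(TP | contact formed) from per-frame booleans via Bayes' rule.

    contact_formed[k] -- True if the contact is formed in frame k (assumed input)
    on_tp[k]          -- True if frame k lies on a transition path (assumed input)
    """
    n = len(contact_formed)
    if n == 0 or n != len(on_tp):
        raise ValueError("inputs must be non-empty and of equal length")

    n_tp = sum(1 for t in on_tp if t)
    n_contact = sum(1 for c in contact_formed if c)
    if n_tp == 0 or n_contact == 0:
        return 0.0  # no transition-path frames or contact never formed

    p_tp = n_tp / n                      # P(TP)
    p_contact = n_contact / n            # P(contact)
    n_both = sum(1 for c, t in zip(contact_formed, on_tp) if c and t)
    p_contact_given_tp = n_both / n_tp   # P(contact | TP)

    return p_contact_given_tp * p_tp / p_contact
```

A contact is informative in this sense when the returned value is well above the baseline P(TP), that is, when seeing the contact formed makes it much more likely that the frame belongs to a transition path.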
Project description:Tissue inhibitor of metalloproteinase 1 (TIMP-1) controls matrix metalloproteinase (MMP) activity through 1:1 stoichiometric binding. Human TIMP-1 fused to a glycosylphosphatidylinositol (GPI) anchor (TIMP-1-GPI) shifts the activity of TIMP-1 from the extracellular matrix to the cell surface. TIMP-1-GPI-treated renal cell carcinoma (RCC) cells show increased apoptosis and reduced proliferation. Transcriptomic profiling and regulatory pathway mapping were used to identify potential mechanisms driving these effects. Significant changes in inhibitor of DNA binding (IDs), TGF-β1/SMAD and BMP pathways resulted from TIMP-1-GPI treatment. These events were linked to reduced TGF-β1 signaling mediated by inhibition of proteolytic processing of latent TGF-β1 by TIMP-1-GPI. Renal cell carcinoma cells were transfected with empty vector, rhTimp1 and two concentrations of Timp1-GPI fusion protein.
Project description:The insertion and folding of proteins into membranes is crucial for cell viability. Yet, the detailed contributions of insertases remain elusive. Here, we monitor how the insertase YidC guides the folding of the polytopic melibiose permease MelB into membranes. In vivo experiments using conditionally depleted E. coli strains show that MelB can insert in the absence of SecYEG if YidC resides in the cytoplasmic membrane. In vitro single-molecule force spectroscopy reveals that the MelB substrate itself forms two folding cores from which structural segments insert stepwise into the membrane. However, misfolding dominates, particularly in structural regions that interface the pseudo-symmetric α-helical domains of MelB. Here, YidC takes an important role in accelerating and chaperoning the stepwise insertion and folding process of both MelB folding cores. Our findings reveal a great flexibility of the chaperoning and insertase activity of YidC in the multifaceted folding processes of complex polytopic membrane proteins.
Project description:The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a diffusion-based generative model that generates protein backbone structures via a procedure inspired by the natural folding process. We describe a protein backbone structure as a sequence of angles capturing the relative orientation of the constituent backbone atoms, and generate structures by denoising from a random, unfolded state towards a stable folded structure. Not only does this mirror how proteins natively twist into energetically favorable conformations, the inherent shift and rotational invariance of this representation crucially alleviates the need for more complex equivariant networks. We train a denoising diffusion probabilistic model with a simple transformer backbone and demonstrate that our resulting model unconditionally generates highly realistic protein structures with complexity and structural patterns akin to those of naturally-occurring proteins. As a useful resource, we release an open-source codebase and trained models for protein structure diffusion.
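As a rough illustration of the angle-based representation described above, the following sketch computes a single dihedral (torsion) angle from four consecutive atom positions using only the standard library. The function name and the coordinate format (3-tuples) are assumptions for illustration; the released codebase may compute and encode its angles differently. The point of the sketch is that such an angle depends only on relative geometry, so it is unchanged by translating or rotating the whole structure.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Return the dihedral angle (radians, in (-pi, pi]) defined by four points."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    def scale(a, s):
        return (a[0] * s, a[1] * s, a[2] * s)

    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)

    # Unit vector along the central bond.
    b1 = scale(b1, 1.0 / math.sqrt(dot(b1, b1)))

    # Components of the outer bonds perpendicular to the central bond.
    v = sub(b0, scale(b1, dot(b0, b1)))
    w = sub(b2, scale(b1, dot(b2, b1)))

    x = dot(v, w)
    y = dot(cross(b1, v), w)
    return math.atan2(y, x)
```

Shifting all four points by the same vector leaves the result unchanged, which is the shift invariance the abstract refers to.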
Project description:C-reactive protein (CRP) is an acute phase reactant secreted by hepatocytes as a pentamer. The structure formation of pentameric CRP has been demonstrated to proceed in a stepwise manner in live cells. Here, we further dissect the sequence determinants that underlie the key steps in cellular folding and assembly of CRP. The initial folding of CRP subunits depends on a leading sequence with a conserved dipeptide that licenses the formation of the hydrophobic core. This drives the bonding of the intra-subunit disulfide requiring a favorable niche largely conferred by a single residue within the C-terminal helix. A conserved salt bridge then mediates the assembly of folded subunits into pentamer. The pentameric assembly harbors a pronounced plasticity in inter-subunit interactions, which may form the basis for a reversible activation of CRP in inflammation. These results provide insights into how sequence constraints are evolved to dictate structure and function of CRP.
Project description:Protein chaperones are essential in all domains of life to prevent and resolve protein misfolding during translation and proteotoxic stress. HSP70 family chaperones, including E. coli DnaK, function in stress-induced protein refolding and degradation, but are dispensable for cellular viability due to redundant chaperone systems that prevent global nascent peptide insolubility. However, the function of HSP70 chaperones in mycobacteria, a genus that includes multiple human pathogens, has not been examined. We find that mycobacterial DnaK is essential for cell growth and required for native protein folding in Mycobacterium smegmatis. Loss of DnaK is accompanied by proteotoxic collapse characterized by the accumulation of insoluble newly synthesized proteins. DnaK is required for solubility of large multimodular lipid synthases, including the essential lipid synthase FASI, and DnaK loss is accompanied by disruption of membrane structure and increased cell permeability. Trigger Factor is nonessential and has a minor role in native protein folding that is only evident in the absence of DnaK. In unstressed cells, DnaK localizes to multiple, dynamic foci, but relocalizes to focal protein aggregates during stationary phase or upon expression of aggregating peptides. Mycobacterial cells restart cell growth after proteotoxic stress by isolating persistent DnaK-containing protein aggregates away from daughter cells. These results reveal unanticipated essential nonredundant roles for DnaK in mycobacteria and indicate that DnaK defines a unique susceptibility point in the mycobacterial proteostasis network.
Project description:Template-based methods for predicting protein structure provide models for a significant portion of the protein but often contain insertions or chain ends (InsEnds) of indeterminate conformation. The local structure prediction "problem" entails modeling the InsEnds onto the rest of the protein. A well-known limit involves predicting loops of up to about 12 residues in crystal structures. However, InsEnds may contain as many as ~50 amino acids, and the template-based model of the protein itself may be imperfect. To address these challenges, we present a free modeling method for predicting the local structure of loops and large InsEnds in both crystal structures and template-based models. The approach uses single amino acid torsional angle "pivot" moves of the protein backbone with a Cβ-level representation. Nevertheless, our accuracy for loops is comparable to existing methods. We also apply a more stringent test, the blind structure prediction and refinement categories of the CASP9 tournament, where we improve the quality of several homology-based models by modeling InsEnds as long as 45 amino acids, sizes generally inaccessible to existing loop prediction methods. Our approach ranks as one of the best in the CASP9 refinement category that involves improving template-based models so that they can function as molecular replacement models to solve the phase problem for crystallographic structure determination.
Project description:To scrutinize how a protein folds at atomic resolution, we performed 200 molecular dynamics simulations (each of 50 ns) of the miniprotein Trp-cage on the computational grid. Within the trajectories, 58 folding and 31 unfolding events were identified and subjected to extensive comparison and classification. Based on an analogy with biological sequences, the folding and unfolding trajectories (arrays of sequential snapshots of structures) were aligned by dynamic programming allowing gaps. A phylogenetic tree derived from the alignments revealed four distinct groups of the trajectories, characterized by the Trp side-chain motions and the main-chain motions. It was found that only one group attained the native structure and that the other three led to pseudonative structures having the correct main-chain trace but different nonnative Trp side-chain rotamers, indicating that those four folded structures were each attained through a unique folding pathway.
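The trajectory alignment described above can be sketched as a standard global alignment by dynamic programming with a linear gap penalty (Needleman-Wunsch style), where snapshots play the role of sequence letters. This is a minimal sketch under stated assumptions: the function name, the gap penalty value, and the snapshot dissimilarity function `dist` are all hypothetical, and the scoring scheme used in the study is not specified here.

```python
def alignment_cost(traj_a, traj_b, dist, gap=1.0):
    """Minimum total cost of globally aligning two trajectories, allowing gaps.

    traj_a, traj_b -- sequences of snapshots (any objects accepted by dist)
    dist           -- function giving a non-negative dissimilarity of two snapshots
    gap            -- cost of leaving one snapshot unaligned (assumed linear penalty)
    """
    n, m = len(traj_a), len(traj_b)
    # cost[i][j] = best cost of aligning the first i snapshots of a
    # with the first j snapshots of b.
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap
    for j in range(1, m + 1):
        cost[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = cost[i - 1][j - 1] + dist(traj_a[i - 1], traj_b[j - 1])
            skip_a = cost[i - 1][j] + gap   # snapshot of a aligned to a gap
            skip_b = cost[i][j - 1] + gap   # snapshot of b aligned to a gap
            cost[i][j] = min(match, skip_a, skip_b)
    return cost[n][m]
```

Pairwise costs computed this way for all trajectory pairs would give a distance matrix, which is the kind of input a tree-building method needs; how the tree in the study was derived from the alignments is not detailed in the text.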
Project description:Although the folding rates of proteins have been studied extensively, both experimentally and theoretically, and many native state topological parameters have been proposed to correlate with or predict these rates, unfolding rates have received much less attention. Moreover, unfolding rates have generally been thought either to not relate to native topology in the same manner as folding rates, perhaps depending on different topological parameters, or to be more difficult to predict. Using a dataset of 108 proteins including two-state and multistate folders, we find that both unfolding and folding rates correlate strongly, and comparably well, with well-established measures of native topology, the absolute contact order and the long range order, with correlation coefficient values of 0.75 or higher. In addition, compared to folding rates, the absolute values of unfolding rates vary more strongly with native topology, have a larger range of values, and correlate better with thermodynamic stability. Similar trends are observed for subsets of different protein structural classes. Taken together, these results suggest that choosing a scaffold for protein engineering may require a compromise between a simple topology that will fold sufficiently quickly but also unfold quickly, and a complex topology that will unfold slowly and hence have kinetic stability, but fold slowly. These observations, together with the established role of kinetic stability in determining resistance to thermal and chemical denaturation as well as proteases, have important implications for understanding fundamental aspects of protein unfolding and folding and for protein engineering and design.
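The two topology measures named above can be written down compactly once a list of native residue-residue contacts is available. The sketch below assumes the commonly used definitions: absolute contact order as the mean sequence separation |i - j| over all contacts, and long range order as the number of contacts separated by more than 12 residues divided by chain length. How contacts are defined (distance cutoff, atoms considered) and the exact separation threshold are assumptions here and may differ from the study.

```python
def absolute_contact_order(contacts):
    """Mean sequence separation |i - j| over native contacts (i, j)."""
    if not contacts:
        raise ValueError("need at least one contact")
    return sum(abs(i - j) for i, j in contacts) / len(contacts)


def long_range_order(contacts, n_residues, min_separation=12):
    """Number of contacts with |i - j| > min_separation, per residue.

    min_separation=12 is an assumed, commonly used threshold.
    """
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    n_long = sum(1 for i, j in contacts if abs(i - j) > min_separation)
    return n_long / n_residues
```

For example, a 30-residue chain with contacts (1, 5), (2, 20) and (10, 30) has an absolute contact order of 14 residues and a long range order of 2/30.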
Project description:Understanding, and ultimately predicting, how a 1-D protein chain reaches its native 3-D fold has been one of the most challenging problems during the last few decades. Data increasingly indicate that protein folding is a hierarchical process. Hence, the question arises as to whether we can use the hierarchical concept to reduce the practically intractable computational times. For such a scheme to work, the first step is to cut the protein sequence into fragments that form local minima on the polypeptide chain. The conformations of such fragments in solution are likely to be similar to those when the fragments are embedded in the native fold, although alternate conformations may be favored during the mutual stabilization in the combinatorial assembly process. Two elements are needed for such cutting: (1) a library of (clustered) fragments derived from known protein structures and (2) an assignment algorithm that selects optimal combinations to "cover" the protein sequence. The next two steps in hierarchical folding schemes, not addressed here, are the combinatorial assembly of the fragments and finally, optimization of the obtained conformations. Here, we address the first step in a hierarchical protein-folding scheme. The input is a target protein sequence and a library of fragments created by clustering building blocks that were generated by cutting all protein structures. The output is a set of cutout fragments. We briefly outline a graph theoretic algorithm that automatically assigns building blocks to the target sequence, and we describe a sample of the results we have obtained.