CSI 3.0: a web server for identifying secondary and super-secondary structure in proteins using NMR chemical shifts.
ABSTRACT: The Chemical Shift Index or CSI 3.0 (http://csi3.wishartlab.com) is a web server designed to accurately identify the location of secondary and super-secondary structures in protein chains using only nuclear magnetic resonance (NMR) backbone chemical shifts and their corresponding protein sequence data. Unlike earlier versions of CSI, which only identified three types of secondary structure (helix, β-strand and coil), CSI 3.0 now identifies a total of 11 types of secondary and super-secondary structures, including helices, β-strands, coil regions, five common β-turns (type I, II, I', II' and VIII), β-hairpins as well as interior and edge β-strands. CSI 3.0 accepts experimental NMR chemical shift data in multiple formats (NMR Star 2.1, NMR Star 3.1 and SHIFTY) and generates colorful CSI plots (bar graphs) and secondary/super-secondary structure assignments. The output can be readily used as constraints for structure determination and refinement, or the images may be used for presentations and publications. CSI 3.0 uses a pipeline of several well-tested, previously published programs to identify the secondary and super-secondary structures in protein chains. Comparisons with secondary and super-secondary structure assignments made via standard coordinate analysis programs such as DSSP, STRIDE and VADAR on high-resolution protein structures solved by X-ray and NMR show >90% agreement with those made with CSI 3.0.
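As a rough illustration of the idea behind a chemical shift index, the sketch below converts per-residue secondary shifts (observed shift minus random coil shift, in ppm) into a ternary index. The function name and the 0.1 ppm cutoff are assumptions made for illustration; this is not the CSI 3.0 pipeline, which the abstract describes as a combination of several previously published programs.

```python
from typing import List


def ternary_index(secondary_shifts: List[float], threshold: float = 0.1) -> List[int]:
    """Map secondary shifts (ppm) to -1, 0 or +1.

    `threshold` is an assumed cutoff chosen for illustration,
    not a value taken from CSI 3.0.
    """
    index = []
    for delta in secondary_shifts:
        if delta > threshold:
            index.append(1)    # shift clearly above the random coil value
        elif delta < -threshold:
            index.append(-1)   # shift clearly below the random coil value
        else:
            index.append(0)    # within the cutoff: treated as coil-like
    return index
```

Runs of identical non-zero values along the sequence are what a shift-index style method would then interpret as candidate helices or strands; which sign corresponds to which structure depends on the nucleus being analyzed.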
Project description:BACKGROUND: A number of methods are now available to perform automatic assignment of periodic secondary structures from atomic coordinates, based on different characteristics of the secondary structures. In general these methods exhibit a broad consensus as to the location of most helix and strand core segments in protein structures. However, the termini of the segments are often ill-defined and it is difficult to decide unambiguously which residues at the edge of the segments have to be included. In addition, there is a "twilight zone" where secondary structure segments depart significantly from the idealized models of Pauling and Corey. For these segments, one has to decide whether the observed structural variations are merely distortions or whether they constitute a break in the secondary structure. METHODS: To address these problems, we have developed a method for secondary structure assignment, called KAKSI. Assignments made by KAKSI are compared with assignments given by DSSP, STRIDE, XTLSSTR, PSEA and SECSTR, as well as secondary structures found in PDB files, on 4 datasets (X-ray structures with different resolution ranges, NMR structures). RESULTS: A detailed comparison of KAKSI assignments with those of STRIDE and PSEA reveals that KAKSI assigns slightly longer helices and strands than STRIDE in case of one-to-one correspondence between the segments. However, KAKSI also tends to favor the assignment of several short helices when STRIDE and PSEA assign longer, kinked helices. Helices assigned by KAKSI have geometrical characteristics close to those described in the PDB. They are more linear than helices assigned by other methods. The same tendency to split long segments is observed for strands, although less systematically. We present a number of cases of secondary structure assignments that illustrate this behavior. CONCLUSION: Our method provides valuable assignments which favor the regularity of secondary structure segments.
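Comparing two assignment methods residue by residue can be sketched as below. This is a minimal example that assumes each assignment is given as a string of one-letter states (for instance H, E and C) of equal length; the function name and the state alphabet are illustrative and are not taken from KAKSI or any of the programs named above.

```python
def percent_agreement(assignment_a: str, assignment_b: str) -> float:
    """Return the percentage of residues given the same state by both methods."""
    if len(assignment_a) != len(assignment_b):
        raise ValueError("assignments must cover the same residues")
    if not assignment_a:
        raise ValueError("assignments must not be empty")
    # Count positions where the two per-residue states are identical.
    matches = sum(a == b for a, b in zip(assignment_a, assignment_b))
    return 100.0 * matches / len(assignment_a)
```

A single per-residue score like this hides exactly the effects the abstract discusses: disagreement concentrated at segment termini, or one long kinked helix versus several short ones, so segment-level comparisons are needed alongside it.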
Project description:PONDEROSA (Peak-picking Of Noe Data Enabled by Restriction of Shift Assignments) accepts input information consisting of a protein sequence, backbone and sidechain NMR resonance assignments, and 3D-NOESY (13C-edited and/or 15N-edited) spectra, and returns assignments of NOESY crosspeaks, distance and angle constraints, and a reliable NMR structure represented by a family of conformers. PONDEROSA incorporates and integrates external software packages (TALOS+, STRIDE and CYANA) to carry out different steps in the structure determination. PONDEROSA implements internal functions that identify and validate NOESY peak assignments and assess the quality of the calculated three-dimensional structure of the protein. The robustness of the analysis results from PONDEROSA's hierarchical processing steps that involve iterative interaction among the internal and external modules. PONDEROSA supports a variety of input formats: SPARKY assignment table (.shifts) and spectrum file formats (.ucsf), XEASY proton file format (.prot), and NMR-STAR format (.star). To demonstrate the utility of PONDEROSA, we used the package to determine 3D structures of two proteins: human ubiquitin and Escherichia coli iron-sulfur scaffold protein variant IscU(D39A). The automatically generated structural constraints and ensembles of conformers were as good as or better than those determined previously by much less automated means. The program, in the form of binary code along with tutorials and reference manuals, is available at http://ponderosa.nmrfam.wisc.edu/.
Project description:Deletion of Phe-508 (F508del) in the first nucleotide binding domain (NBD1) of the cystic fibrosis transmembrane conductance regulator (CFTR) leads to defects in folding and channel gating. NMR data on human F508del NBD1 indicate that an H620Q mutant, shown to increase channel open probability, and the dual corrector/potentiator CFFT-001 similarly disrupt interactions between β-strands S3, S9, and S10 and the C-terminal helices H8 and H9, shifting a preexisting conformational equilibrium from helix to coil. CFFT-001 appears to interact with β-strands S3/S9/S10, consistent with docking simulations. Decreases in Tm from differential scanning calorimetry with H620Q or CFFT-001 suggest direct compound binding to a less thermostable state of NBD1. We hypothesize that, in full-length CFTR, shifting the conformational equilibrium to reduce H8/H9 interactions with the uniquely conserved strands S9/S10 facilitates release of the regulatory region from the NBD dimerization interface to promote dimerization and thereby increase channel open probability. These studies enabled by our NMR assignments for F508del NBD1 provide a window into the conformational fluctuations within CFTR that may regulate function and contribute to folding energetics.
Project description:Leucine rich repeats (LRRs) are present in over 100,000 proteins from viruses to eukaryotes. The LRRs are 20-30 residues long and occur in tandem. LRRs form parallel stacks of short β-strands and then assume a super helical arrangement called a solenoid structure. Individual LRRs are divided into a highly conserved segment (HCS) with the consensus LxxLxLxxNxL and a variable segment (VS). Eight classes have been recognized. Bacterial LRRs are short and characterized by two prolines in the VS; the consensus is xxLPxLPxx with Nine residues (N-subtype) and xxLPxxLPxx with Ten residues (T-subtype). Bacterial LRRs are contained in type III secretion system effectors such as YopM, IpaH3/9.8, SspH1/2, and SlrP from bacteria. Some LRRs in decorin, fibromodulin, TLR8/9, and FLRT2/3 from vertebrates also contain the motifs. In order to understand structural features of bacterial LRRs, we performed both secondary structure assignments using four programs (DSSP-PPII, PROSS, SEGNO, and XTLSSTR) and HELFIT analyses (calculating helix axis, pitch, radius, residues per turn, and handedness), based on the atomic coordinates of their crystal structures. The N-subtype VS adopts a left-handed polyproline II helix (PPII) with four, five or six residues and a type I β-turn at the C-terminal side. Thus, the N-subtype is characterized by a super secondary structure consisting of a PPII and a β-turn. In contrast, the T-subtype VS prefers two separate PPIIs with two or three and two residues. The HELFIT analysis indicates that the type I β-turn is a right-handed helix. The HELFIT analysis determines three unit vectors of the helix axes of PPII (P), β-turn (B), and LRR domain (A). Three structural parameters using these three helix axes are suggested to characterize the super secondary structure and the LRR domain.
Project description:In this article we present 1H and 13C chemical shift assignments, secondary structural propensity data and normalized temperature coefficient data for N-terminal peptides of Connexin 26 (Cx26), Cx26G12R and Cx32G12R mutants seen in syndromic deafness and Charcot-Marie-Tooth disease respectively, published in "Structural Studies of N-Terminal Mutants of Connexin 26 and Connexin 32 Using 1H NMR Spectroscopy" (Y. Batir, T.A. Bargiello, T.L. Dowd, 2016). The mutation G12R affects the structures of the Cx26 and Cx32 peptides differently. We present data from secondary structure propensity chemical shift analysis, which calculates a secondary structure propensity (SSP) score for both disordered and folded peptides and proteins using the difference between the 13C secondary chemical shifts of the Cα and Cβ carbons. This data supplements the calculated NMR structures from NOESY data. We present and compare the SSP data for the Cx26 vs Cx26G12R peptides and the Cx32 and Cx32G12R peptides. In addition, we present plots of temperature coefficients obtained for Cx26, Cx26G12R and Cx32G12R peptides, collected previously and normalized to their random coil temperature coefficients (see "Random coil 1H chemical shifts obtained as a function of temperature and trifluoroethanol concentration for the peptide series GGXGG", G. Merutka, H.J. Dyson, P.E. Wright, 1995). Reductions in these normalized temperature coefficients are directly observable for residues in different segments of the peptide, and this data informs on the solvent accessibility of the NH protons and identifies NH protons that may be more constrained due to the formation of H bonds.
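The quantity that the SSP analysis builds on, the difference between the Cα and Cβ secondary chemical shifts, can be sketched as follows. This is only the per-residue difference, not the SSP score itself, whose weighting and averaging are not described in the abstract; the function name and the input layout (two equal-length lists of secondary shifts in ppm) are assumptions for illustration.

```python
from typing import List


def ca_cb_secondary_shift_difference(
    delta_ca: List[float], delta_cb: List[float]
) -> List[float]:
    """Return the per-residue difference (Cα minus Cβ secondary shift, in ppm).

    Each input value is assumed to be an observed 13C shift minus the
    corresponding random coil shift for that residue.
    """
    if len(delta_ca) != len(delta_cb):
        raise ValueError("both lists must cover the same residues")
    return [ca - cb for ca, cb in zip(delta_ca, delta_cb)]
```

Positive differences are generally taken to indicate helical propensity and negative differences extended (β) propensity, which is why the combined quantity is more informative than either shift alone.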
Project description:The ribosome is imprinted with a detailed molecular chronology of the origins and early evolution of proteins. Here we show that when arranged by evolutionary phase of ribosomal evolution, ribosomal protein (rProtein) segments reveal an atomic level history of protein folding. The data support a model in which aboriginal oligomers evolved into globular proteins in a hierarchical step-wise process. Complexity of assembly and folding of polypeptide increased incrementally in concert with expansion of rRNA. (i) Short random coil proto-peptides bound to rRNA, and (ii) lengthened over time and coalesced into β-β secondary elements. These secondary elements (iii) accreted and collapsed, primarily into β-domains. Domains (iv) accumulated and gained complex super-secondary structures composed of mixtures of α-helices and β-strands. Early protein evolution was guided and accelerated by interactions with rRNA. rRNA and proto-peptide provided mutual protection from chemical degradation and disassembly. rRNA stabilized polypeptide assemblies, which evolved in a stepwise process into globular domains, bypassing the immense space of random unproductive sequences. Coded proteins originated as oligomers and polymers created by the ribosome, on the ribosome and for the ribosome. Synthesis of increasingly longer products was iteratively coupled with lengthening and maturation of the ribosomal exit tunnel. Protein catalysis appears to be a late byproduct of selection for sophisticated and finely controlled assembly.
Project description:Coiled coil is a ubiquitous structural motif in proteins, with two to seven alpha helices coiled together like the strands of a rope, and coiled coil folding and assembly is not completely understood. A GCN4 leucine zipper mutant with four mutations (K3A, D7A, Y17W, and H18N) has been designed, and the crystal structure has been determined at 1.6 Å resolution. The peptide monomer shows a helix trunk with short curved N- and C-termini. In the crystal, two monomers cross at 35 degrees and form an X-shaped dimer, and each X-shaped dimer is welded into the next one through sticky hydrophobic ends, thus forming an extended two-stranded, parallel, super long coiled coil rather than a discrete, two-helix coiled coil of the wild-type GCN4 leucine zipper. Leucine residues appear at every seventh position in the super long coiled coil, suggesting that it is an extended super leucine zipper. Compared to the wild-type leucine zipper, the N-terminus of the mutant has a dramatic conformational change and the C-terminus has one more residue, Glu 32, determined. The mutant X-shaped dimer has a large crossing angle of 35 degrees instead of 18 degrees in the wild-type dimer. The results show a novel assembly mode and oligomeric state of coiled coil, and demonstrate that mutations may affect folding and assembly of the overall coiled coil. Analysis of the formation mechanism of the super long coiled coil may help understand and design self-assembling protein fibers.
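A crossing angle such as the 35 degrees quoted above is, in the simplest reading, the angle between the two helix axes. The sketch below computes that angle from two direction vectors; it assumes the axes have already been fitted elsewhere (axis fitting is not shown, and the abstract does not say how the authors measured the angle), and the function name is illustrative.

```python
import math
from typing import Sequence


def crossing_angle_deg(axis_a: Sequence[float], axis_b: Sequence[float]) -> float:
    """Return the angle in degrees between two helix axis direction vectors."""
    dot = sum(a * b for a, b in zip(axis_a, axis_b))
    norm_a = math.sqrt(sum(a * a for a in axis_a))
    norm_b = math.sqrt(sum(b * b for b in axis_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("axis vectors must be non-zero")
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))
```

Because the result depends on the direction chosen for each axis, parallel helices should have both vectors pointing from N- to C-terminus so that a small angle means a shallow crossing.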
Project description:Determination of accurate resonance assignments from multidimensional chemical shift correlation spectra is one of the major problems in biomolecular solid state NMR, particularly for relatively large proteins with less-than-ideal NMR linewidths. This article investigates the difficulty of resonance assignment, using a computational Monte Carlo/simulated annealing (MCSA) algorithm to search for assignments from artificial three-dimensional spectra that are constructed from the reported isotropic 15N and 13C chemical shifts of two proteins whose structures have been determined by solution NMR methods. The results demonstrate how assignment simulations can provide new insights into factors that affect the assignment process, which can then help guide the design of experimental strategies. Specifically, simulations are performed for the catalytic domain of SrtC (147 residues, primarily β-sheet secondary structure) and the N-terminal domain of MLKL (166 residues, primarily α-helical secondary structure). Assuming unambiguous residue-type assignments and four ideal three-dimensional data sets (NCACX, NCOCX, CONCA, and CANCA), uncertainties in chemical shifts must be less than 0.4 ppm for assignments for SrtC to be unique, and less than 0.2 ppm for MLKL. Eliminating CANCA data has no significant effect, but additionally eliminating CONCA data leads to more stringent requirements for chemical shift precision. Introducing moderate ambiguities in residue-type assignments does not have a significant effect.
Project description:We present a method that measures the accuracy of NMR protein structures. It compares random coil index [RCI] against local rigidity predicted by mathematical rigidity theory, calculated from NMR structures [FIRST], using a correlation score (which assesses secondary structure), and an RMSD score (which measures overall rigidity). We test its performance using: structures refined in explicit solvent, which are much better than unrefined structures; decoy structures generated for 89 NMR structures; and conventional predictors of accuracy such as number of restraints per residue, restraint violations, energy of structure, ensemble RMSD, Ramachandran distribution, and clashscore. Restraint violations and RMSD are poor measures of accuracy. Comparisons of NMR to crystal structures show that secondary structure is equally accurate, but crystal structures are typically too rigid in loops, whereas NMR structures are typically too floppy overall. We show that the method is a useful addition to existing measures of accuracy.
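The two scores named above compare a pair of per-residue profiles, one derived from chemical shifts and one from the structure. A minimal sketch of such a comparison is given below. It uses a Pearson correlation and a plain RMSD as assumptions for illustration; the abstract does not specify the correlation measure or any normalization, so the published method may differ, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def correlation_and_rmsd(
    profile_a: Sequence[float], profile_b: Sequence[float]
) -> Tuple[float, float]:
    """Return (Pearson correlation, RMSD) between two per-residue profiles."""
    n = len(profile_a)
    if n != len(profile_b) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_a = sum(profile_a) / n
    mean_b = sum(profile_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b))
    var_a = sum((a - mean_a) ** 2 for a in profile_a)
    var_b = sum((b - mean_b) ** 2 for b in profile_b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("correlation is undefined for a constant profile")
    correlation = cov / math.sqrt(var_a * var_b)
    rmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)) / n)
    return correlation, rmsd
```

The two numbers capture different things: the correlation reflects whether rigid and flexible regions fall in the same places, while the RMSD reflects whether the overall level of rigidity matches, which is why a structure can score well on one and poorly on the other.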
Project description:Local structures in denatured proteins may be important in guiding a polypeptide chain during the folding and misfolding processes. Existence of local structures in chemically denatured proteins is a highly controversial issue. NMR parameters [coupling constants 3J(Hα,HN) and chemical shifts] of chemically denatured proteins in general deviate little from their values in small peptides. These peptides were presumed to be completely unstructured; therefore, it was considered that chemically denatured proteins are random coils. But recent experimental studies show that small peptides adopt relatively stable structures in aqueous solutions. Small deviations of the NMR parameters from their values in small peptides may thus actually indicate the existence of local structures in chemically denatured proteins. Using NMR data and theoretical predictions we show here that fluctuating beta-strands exist in urea-denatured ubiquitin (8 M urea at pH 2). Residues in such beta-strands populate more frequently the left side of the broad beta region of φ-ψ space. Urea-denatured ubiquitin contains no detectable beta-sheet secondary structures; nevertheless, the fluctuating beta-strands in urea-denatured ubiquitin coincide with the beta-strands in the native state. Formation of beta-strands is in accord with the electrostatic screening model of unfolded proteins. The free energy of a residue in an unfolded protein is in this model determined by the local backbone electrostatics and its screening by backbone solvation. These energy terms introduce strong electrostatic coupling between neighboring residues, which causes cooperative formation of beta-strands in denatured proteins. We propose that fluctuating beta-strands in denatured proteins may serve as initiation sites to form fibrils.