Molecular Determinants of Mutant Phenotypes, Inferred from Saturation Mutagenesis Data.
ABSTRACT: Understanding how mutations affect protein activity and organismal fitness is a major challenge. We used saturation mutagenesis combined with deep sequencing to determine mutational sensitivity scores for 1,664 single-site mutants of the 101-residue Escherichia coli cytotoxin CcdB at seven different expression levels. Active-site residues could be distinguished from buried ones, based on their differential tolerance to aliphatic and charged amino acid substitutions. At nonactive-site positions, the average mutational tolerance correlated better with depth from the protein surface than with accessibility. Remarkably, similar results were observed for two other small proteins, PDZ domain (PSD95pdz3) and IgG-binding domain of protein G (GB1). Mutational sensitivity data obtained with CcdB were used to derive a procedure for predicting functional effects of mutations. Results compared favorably with those of two widely used computational predictors. In vitro characterization of 80 single, nonactive-site mutants of CcdB showed that activity in vivo correlates moderately with thermal stability and solubility. The inability to refold reversibly, as well as a decreased folding rate in vitro, is associated with decreased activity in vivo. Upon probing the effect of modulating expression of various proteases and chaperones on mutant phenotypes, most deleterious mutants showed an increased in vivo activity and solubility only upon over-expression of either Trigger factor or SecB ATP-independent chaperones. Collectively, these data suggest that folding kinetics rather than protein stability is the primary determinant of activity in vivo. This study enhances our understanding of how mutations affect phenotype, as well as the ability to predict fitness effects of point mutations.
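To put the library size in the abstract above in context, the following minimal sketch estimates what fraction of all possible single-site substitutions the 1,664 scored mutants represent. It assumes that each of the 101 positions can be replaced by the 19 other standard amino acids and that stop codons are not counted; the abstract does not state this, so the resulting percentage is illustrative only. The function name `library_coverage` is hypothetical.

```python
# Coverage of a single-site saturation mutagenesis library (illustrative sketch).
N_RESIDUES = 101      # length of CcdB, from the abstract
N_ALTERNATIVES = 19   # assumption: 19 non-wild-type amino acids per position
N_SCORED = 1664       # mutants with sensitivity scores, from the abstract


def library_coverage(n_scored, n_residues, n_alternatives=N_ALTERNATIVES):
    """Return (number of possible single mutants, fraction actually scored)."""
    possible = n_residues * n_alternatives
    return possible, n_scored / possible


possible, coverage = library_coverage(N_SCORED, N_RESIDUES)
print(f"{N_SCORED} of {possible} possible single mutants ({coverage:.1%})")
```

Under these assumptions the scored set covers roughly 87% of the possible single-site mutants.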
Project description:Plasmodium falciparum thymidylate synthase-dihydrofolate reductase (TS-DHFR) is an essential enzyme in nucleotide biosynthesis and a validated molecular drug target in malaria. Because P. falciparum TS and DHFR are highly homologous to their human counterparts, existing active-site antifolate drugs can have dose-limiting toxicities. In humans, TS and DHFR are two separate proteins. In P. falciparum, however, TS-DHFR is bifunctional, with both TS and DHFR active sites on a single polypeptide chain of the enzyme. Consequently, P. falciparum TS-DHFR contains unique distant or nonactive regions that might modulate catalysis: (1) an N-terminal tail and (2) a linker region tethering DHFR to TS, and encoding a crossover helix that forms critical electrostatic interactions with the DHFR active site. The role of these nonactive sites in the bifunctional P. falciparum TS-DHFR is unknown. We report the first in-depth, pre-steady-state kinetic characterization of the full-length, wild-type (WT) P. falciparum TS-DHFR enzyme and probe the role of distant, nonactive regions through mutational analysis. We show that the overall rate-limiting step in the WT P. falciparum TS-DHFR enzyme is TS catalysis. We further show that if TS is in an activated (liganded) conformation, the DHFR rate is 2-fold activated, from 60 s-1 to 130 s-1 in the WT enzyme. The TS rate is also reciprocally activated by approximately 1.5-fold if DHFR is in an activated, ligand-bound conformation. Mutations to the linker region affect neither catalytic rate nor domain-domain communication. Deletion of the N-terminal tail, although in a location remote from the active site, decreases the DHFR single rate and the bifunctional TS-DHFR rate by a factor of 2. The 2-fold activation of the DHFR rate by TS ligands remains intact, although even the activated N-terminal mutant has just half the DHFR activity of the WT enzyme. 
However, the reciprocal communication between TS active site and DHFR ligands is impaired in N-terminal mutants. Surprisingly, deletion of the analogous N-terminal tail in Leishmania major TS-DHFR causes a 3-fold enhancement of the DHFR rate from approximately 14 s-1 to approximately 40 s-1. In summary, our results demonstrate a complex interplay of domain-domain communication and nonactive-site modulation of catalysis in P. falciparum TS-DHFR. Furthermore, each parasitic TS-DHFR is activated by unique mechanisms, modulated by their nonactive site regions. Finally, our studies suggest the N-terminal tail of P. falciparum TS-DHFR is a highly selective, novel target for potential antifolate development in malaria.
Project description:When incorporated into a polypeptide chain, proline (Pro) differs from all other naturally occurring amino acid residues in two important respects. The phi dihedral angle of Pro is constrained to values close to -65 degrees and Pro lacks an amide hydrogen. Consequently, mutations which result in introduction of Pro can significantly affect protein stability. In the present work, we describe a procedure to accurately predict the effect of Pro introduction on protein thermodynamic stability. Seventy-seven of the 97 non-Pro amino acid residues in the model protein, CcdB, were individually mutated to Pro, and the in vivo activity of each mutant was characterized. A decision tree to classify the mutation as perturbing or nonperturbing was created by correlating stereochemical properties of mutants to activity data. The stereochemical properties including main chain dihedral angle phi and main chain amide H-bonds (hydrogen bonds) were determined from 3D models of the mutant proteins built using MODELLER. We assessed the performance of the decision tree on a large dataset of 163 single-site Pro mutations of T4 lysozyme, 74 nsSNPs, and 52 other Pro substitutions from the literature. The overall accuracy of this algorithm was found to be 81% in the case of CcdB, 77% in the case of lysozyme, 76% in the case of nsSNPs, and 71% in the case of other Pro substitution data. The accuracy of Pro scanning mutagenesis for secondary structure assignment was also assessed and found to be at best 69%. Our prediction procedure will be useful in annotating uncharacterized nsSNPs of disease-associated proteins and for protein engineering and design.
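The classification idea in the abstract above can be sketched as a two-question decision rule. This is not the published decision tree: the abstract names only the two stereochemical inputs (main-chain phi and main-chain amide hydrogen bonds), so the tolerance `PHI_TOLERANCE`, the order of the tests, and the names `Site` and `classify_pro_substitution` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

PRO_PHI = -65.0        # preferred phi of Pro in degrees, from the abstract
PHI_TOLERANCE = 25.0   # hypothetical tolerance; the abstract gives no threshold


@dataclass
class Site:
    phi: float         # main-chain phi angle (degrees) at the substituted position
    amide_hbond: bool  # True if the backbone amide N-H at this position donates an H-bond


def classify_pro_substitution(site: Site) -> str:
    """Toy rule: Pro is tolerated only if no amide H-bond is lost and phi is Pro-like."""
    if site.amide_hbond:
        # Pro lacks an amide hydrogen, so this hydrogen bond would be lost.
        return "perturbing"
    if abs(site.phi - PRO_PHI) > PHI_TOLERANCE:
        # The backbone would have to move to accommodate the constrained phi of Pro.
        return "perturbing"
    return "nonperturbing"


print(classify_pro_substitution(Site(phi=-70.0, amide_hbond=False)))
print(classify_pro_substitution(Site(phi=-70.0, amide_hbond=True)))
```

A real implementation would take these properties from 3D models of each mutant, as the abstract describes, and would fit the thresholds to the activity data.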
Project description:Temperature-sensitive (Ts) mutants are a powerful tool with which to study gene function in vivo. Ts mutants are typically generated by random mutagenesis followed by laborious screening procedures. By using the Escherichia coli cytotoxin CcdB as a model system, simple procedures for generating Ts mutants at high frequency through site-directed mutagenesis were developed. Putative buried, hydrophobic residues are selected through analysis of the protein sequence. Residue burial is confirmed by ensuring that substitution of the residue by Asp leads to protein inactivation. At such sites, a Ts phenotype can typically be generated either by (i) substitution of two predicted, buried residues with the 18 remaining amino acids or (ii) introduction of Lys, Ser, Ala, and Trp at three to four predicted buried sites. By using these design strategies, 17 tight Ts mutants of CcdB were isolated at four predicted buried sites. The rules were further verified by making several Ts mutants of yeast Gal4 at residues 68, 69, and 70. No Ts mutants of either protein have been previously reported. Such Ts mutants of Gal4 can be used for conditional expression of a variety of genes by using the well characterized upstream-activating-sequence-Gal4 system.
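The two design strategies in the abstract above translate into small, enumerable mutant libraries. The sketch below lists them for hypothetical buried sites; the positions and wild-type residues are placeholders, not sites from the study. It also assumes that "the 18 remaining amino acids" means the 20 standard residues minus the wild type and minus Asp, which was already used to confirm burial.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def strategy_one(sites):
    """Strategy (i): every residue except the wild type and Asp at each site."""
    return [f"{wt}{pos}{aa}"
            for wt, pos in sites
            for aa in AMINO_ACIDS
            if aa not in (wt, "D")]


def strategy_two(sites, substitutions="KSAW"):
    """Strategy (ii): Lys, Ser, Ala and Trp at each site (skipping the wild type)."""
    return [f"{wt}{pos}{aa}"
            for wt, pos in sites
            for aa in substitutions
            if aa != wt]


# Hypothetical buried sites as (wild-type residue, position); not taken from the paper.
sites_i = [("V", 10), ("L", 20)]
sites_ii = [("V", 10), ("L", 20), ("I", 30), ("F", 40)]

library_i = strategy_one(sites_i)
library_ii = strategy_two(sites_ii)
print(len(library_i), len(library_ii))
```

With two sites, strategy (i) yields 36 mutants; with four sites, strategy (ii) yields 16, which illustrates why both are far less laborious than random mutagenesis followed by screening.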
Project description:Human liver cytosolic and mitochondrial isozymes of aldehyde dehydrogenase share 70% sequence identity. However, the first 21 residues are not conserved between the human isozymes (15% identity). The beef mitochondrial and sheep cytosolic forms have virtually identical three-dimensional structures. Here, we solved the structure of the human mitochondrial enzyme and found it to be identical to the beef enzyme. The first 21 residues are found on the surface of the enzyme and make no contact with other subunits in the tetramer. A pair of chimeric enzymes between the human isozymes was made. Each chimera had the first 21 residues from one isozyme and the remaining 479 from the other. When the first 21 residues were from the mitochondrial isozyme, an enzyme with cytosolic-like properties was produced. The other was expressed but was insoluble. It was possible to restore solubility and activity to the chimera that had the first 21 cytosolic residues fused to the mitochondrial ones by making point mutations to residues at the N-terminal end. When residue 19 was changed from tyrosine to a cysteine, the residue found in the mitochondrial form, an active enzyme could be made, though the Km for NAD+ was 35 times higher than that of the native mitochondrial isozyme and the specific activity was reduced by 75%. This residue interacts with residue 203, a nonconserved, nonactive-site residue. A mutation of residue 18, which also interacts with 203, restored solubility but not activity. Mutation of residue 15, which interacts with 104, also restored solubility but not activity. It appears that, to have a soluble or active enzyme, a favorable interaction must occur between a residue in a surface loop and a residue elsewhere in the molecule, even though neither makes contact with the active-site region of the enzyme.
Project description:Toxin-antitoxin (TA) systems are found on both bacterial plasmids and chromosomes, but in most cases their functional role is unclear. Gene knockouts often yield limited insights into functions of individual TA systems because of their redundancy. The well-characterized F-plasmid-based CcdAB TA system is important for F-plasmid maintenance. We have isolated several point mutants of the toxin CcdB that fail to bind to its cellular target, DNA gyrase, but retain binding to the antitoxin, CcdA. Expression of such mutants is shown to result in release of the WT toxin from a functional preexisting TA complex as well as derepression of the TA operon. One such inactive, active-site mutant of CcdB was used to demonstrate the contribution of CcdB to antibiotic persistence. Transient activation of WT CcdB either by coexpression of the mutant or by antibiotic/heat stress was shown to enhance the generation of drug-tolerant persisters in a process dependent on Lon protease and RecA. An F-plasmid containing a ccd locus can, therefore, function as a transmissible persistence factor.
Project description:Proteins are marginally stable, and an understanding of the sequence determinants for improved protein solubility is highly desired. For enzymes, it is well known that many mutations that increase protein solubility decrease catalytic activity. These competing effects frustrate efforts to design and engineer stable, active enzymes without laborious high-throughput activity screens. To address the trade-off between enzyme solubility and activity, we performed deep mutational scanning using two different screens/selections that purport to gauge protein solubility for two full-length enzymes. We assayed a TEM-1 beta-lactamase variant and levoglucosan kinase (LGK) using yeast surface display (YSD) screening and a twin-arginine translocation pathway selection. We then compared these scans with published experimental fitness landscapes. Results from the YSD screen could explain 37% of the variance in the fitness landscapes for one enzyme. Five percent to 10% of all single missense mutations improve solubility, matching theoretical predictions of global protein stability. For a given solubility-enhancing mutation, the probability that it would retain wild-type fitness was correlated with evolutionary conservation and distance to active site, and anticorrelated with contact number. Hybrid classification models were developed that could predict solubility-enhancing mutations that maintain wild-type fitness with an accuracy of 90%. The downside of using such classification models is the removal of rare mutations that improve both fitness and solubility. To reveal the biophysical basis of enhanced protein solubility and function, we determined the crystallographic structure of one such LGK mutant. Beyond fundamental insights into trade-offs between stability and activity, these results have potential biotechnological applications.
Project description:BACKGROUND: Expression and purification of correctly folded proteins typically require screening of different parameters such as protein variants, solubility-enhancing tags or expression hosts. Parallel vector series that cover all variations are available, but not without compromise. We have established a fast, efficient and background-free cloning approach that can be applied to any selected vector. RESULTS: Here we describe a method to tailor selected expression vectors for parallel Sequence and Ligation Independent Cloning (SLIC). SLIC enables precise and sequence-independent engineering and is based on joining vector and insert with 15-25 bp homologies on both DNA ends by homologous recombination. We modified expression vectors based on pET, pFastBac and pTT backbones for parallel PCR-based cloning and screening in E. coli, insect cells and HEK293E cells, respectively. We introduced the toxic ccdB gene under control of a strong constitutive promoter for counterselection of insertless vector. In contrast to DpnI treatment, commonly used to reduce vector background, ccdB used in our vector series is 100% efficient in killing cells carrying the parental vector and reduces vector background to zero. In addition, the 3' end of ccdB functions as a primer binding site common to all vectors. The second shared primer binding site is provided by an HRV 3C protease cleavage site located downstream of purification and solubility-enhancing tags for tag removal. We have so far generated more than 30 different parallel expression vectors, and successfully cloned and expressed more than 250 genes with this vector series. There is no size restriction for gene insertion; cloning efficiency is >95%, with clone numbers up to 200. The procedure is simple, fast, efficient and cost-effective. All expression vectors showed efficient expression of eGFP and of different target proteins requested to be produced and purified at our Core Facility services.
CONCLUSION: This new expression vector series allows efficient and cost-effective parallel cloning and thus screening of different protein constructs, tags and expression hosts.
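The homology-based joining described in the abstract above implies a simple primer layout: each PCR primer carries a 15-25 bp arm matching the vector, followed by a gene-specific annealing region. The sketch below builds such a primer pair. All sequences are made-up placeholders, not the shared ccdB or HRV 3C sites of the actual vector series, and `slic_primers` and the 20 nt annealing length are hypothetical choices for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def slic_primers(insert, upstream_arm, downstream_arm, anneal_len=20):
    """Build forward and reverse primers that add vector homology arms to an insert."""
    for arm in (upstream_arm, downstream_arm):
        if not 15 <= len(arm) <= 25:
            raise ValueError("homology arms should be 15-25 bp long")
    # Forward primer: vector arm followed by the start of the insert.
    forward = upstream_arm + insert[:anneal_len]
    # Reverse primer: reverse complement of (end of insert + downstream vector arm).
    reverse = reverse_complement(insert[-anneal_len:] + downstream_arm)
    return forward, reverse


# Placeholder sequences for illustration only.
UP_ARM = "GATTACAGATTACAGATT"
DOWN_ARM = "CCGGTTAACCGGTTAACC"
INSERT = "ATGGCTAGCAAAGGTGAAGAACTGTTCACCGGTGTTGTTCCGATCCTGGTTGAACTGTAA"

forward, reverse = slic_primers(INSERT, UP_ARM, DOWN_ARM)
print(forward)
print(reverse)
```

Because the two vector arms are the same for every insert, only the gene-specific portion of each primer changes from one construct to the next, which is what makes parallel cloning across vectors practical.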
Project description:Identification of residue-residue contacts from primary sequence can be used to guide protein structure prediction. Using Escherichia coli CcdB as the test case, we describe an experimental method termed saturation-suppressor mutagenesis to acquire residue contact information. In this methodology, for each of five inactive CcdB mutants, exhaustive screens for suppressors were performed. Proximal suppressors were accurately discriminated from distal suppressors based on their phenotypes when present as single mutants. Experimentally identified putative proximal pairs formed spatial constraints to recover >98% of native-like models of CcdB from a decoy dataset. Suppressor methodology was also applied to the integral membrane protein, diacylglycerol kinase A where the structures determined by X-ray crystallography and NMR were significantly different. Suppressor as well as sequence co-variation data clearly point to the X-ray structure being the functional one adopted in vivo. The methodology is applicable to any macromolecular system for which a convenient phenotypic assay exists.
Project description:Every residue of the 101-aa Escherichia coli toxin CcdB was substituted with Ala, Asp, Glu, Lys, and Arg by using site-directed mutagenesis. The activity of each mutant in vivo was characterized as a function of Controller of Cell Division or Death B protein (CcdB) transcriptional level. The mutation data suggest that an accessibility value of 5% is an appropriate cutoff for definition of buried residues. At all buried positions, introduction of Asp results in an inactive phenotype at all CcdB transcriptional levels. The average amount of destabilization upon substitution at buried positions decreases in the order Asp>Glu>Lys>Arg>Ala. Asp substitutions at buried sites in two other proteins, maltose-binding protein and thioredoxin, also were shown to be severely destabilizing. Ala and Asp scanning mutagenesis, in combination with dose-dependent expression phenotypes, was shown to yield important information on protein structure and activity. These results also suggest that such scanning mutagenesis data can be used to rank order sequence alignments and their corresponding homology models, as well as to distinguish between correct and incorrect structural alignments. With continuous reductions in oligonucleotide costs and increasingly efficient site-directed mutagenesis procedures, comprehensive scanning mutagenesis experiments for small proteins/domains are quite feasible.
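Two quantitative results from the abstract above, the 5% accessibility cutoff for burial and the rank order of destabilization at buried positions, can be written down as a small sketch. The accessibility values below are hypothetical, and treating a value exactly at the cutoff as buried is an assumption; the abstract does not say whether the cutoff is inclusive.

```python
BURIED_CUTOFF_PCT = 5.0  # accessibility cutoff suggested in the abstract

# Average destabilization at buried positions, most to least, from the abstract.
DESTABILIZATION_ORDER = ["Asp", "Glu", "Lys", "Arg", "Ala"]


def is_buried(accessibility_pct, cutoff=BURIED_CUTOFF_PCT):
    """Assumption: residues at or below the cutoff count as buried."""
    return accessibility_pct <= cutoff


def more_destabilizing(sub_a, sub_b):
    """True if sub_a is on average more destabilizing than sub_b at buried sites."""
    return DESTABILIZATION_ORDER.index(sub_a) < DESTABILIZATION_ORDER.index(sub_b)


# Hypothetical side-chain accessibilities (percent); not values from the study.
accessibility = {"site_A": 0.0, "site_B": 4.2, "site_C": 37.5}
buried = [site for site, acc in accessibility.items() if is_buried(acc)]
print(buried)
print(more_destabilizing("Asp", "Ala"))
```

In practice, the rule would be applied in reverse as well: an inactive phenotype upon Asp substitution at all expression levels is evidence that a position is buried.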
Project description:Protein quality control is accomplished by inducing chaperones and proteases in response to an altered cellular folding state. In Escherichia coli, expression of chaperones and proteases is positively regulated by sigma32. Chaperone-mediated negative feedback control of sigma32 activity allows this transcription factor to sense the cellular folding state. We identified point mutations in sigma32 altered in feedback control. Surprisingly, such mutants are resistant to inhibition by both the DnaK/J and GroEL/S chaperones in vivo and also show dramatically increased stability. Further characterization of the most defective mutant revealed that it has almost normal binding to chaperones and RNA polymerase and is competent for chaperone-mediated inactivation in vitro. We suggest that the mutants identify a regulatory step downstream of chaperone binding that is required for both inactivation and degradation of sigma32.