Heterologous expression of L. major proteins in S. cerevisiae: a test of solubility, purity, and gene recoding.
ABSTRACT: High-level expression of many eukaryotic proteins for structural analysis is likely to require a eukaryotic host, since many proteins are either insoluble or lack essential post-translational modifications when expressed in E. coli. The well-studied eukaryote Saccharomyces cerevisiae possesses several attributes of a good expression host: it is simple and inexpensive to culture, has proven genetic tractability, and has excellent recombinant DNA tools. We demonstrate here that this yeast exhibits three additional characteristics that are desirable in a eukaryotic expression host. First, expression in yeast significantly improves the solubility of proteins that are expressed but insoluble in E. coli. The expression and solubility of 83 Leishmania major ORFs were compared in S. cerevisiae and in E. coli, with the result that 42 of the 64 ORFs with good expression and poor solubility in E. coli are highly soluble in S. cerevisiae. Second, the yield and purity of heterologous proteins expressed in yeast are sufficient for structural analysis, as demonstrated with both small-scale purifications of 21 highly expressed proteins and large-scale purifications of 2 proteins, which yield highly homogeneous preparations. Third, protein expression can be improved by altering codon usage, based on the observation that a codon-optimized construct of one ORF yields three-fold more protein. Thus, these results provide direct verification that high-level expression and purification of heterologous proteins in S. cerevisiae are feasible and likely to improve expression of proteins whose solubility in E. coli is poor.
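The counts quoted in the abstract above can be turned into rates with a few lines of arithmetic. The following is a minimal sketch: the constant and function names are illustrative assumptions, and only the three counts (83, 64, 42) come from the abstract.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rejecting impossible inputs."""
    if whole <= 0 or part < 0 or part > whole:
        raise ValueError("expected 0 <= part <= whole and whole > 0")
    return 100.0 * part / whole


# Counts taken from the abstract; the names are hypothetical labels.
ORFS_COMPARED = 83               # L. major ORFs tested in both hosts
EXPRESSED_INSOLUBLE_ECOLI = 64   # good expression, poor solubility in E. coli
SOLUBLE_IN_YEAST = 42            # of those 64, highly soluble in S. cerevisiae

# Share of the E. coli-insoluble ORFs that became soluble in yeast.
rescue_rate = percent(SOLUBLE_IN_YEAST, EXPRESSED_INSOLUBLE_ECOLI)

# Share of all compared ORFs that were expressed but insoluble in E. coli.
insoluble_share = percent(EXPRESSED_INSOLUBLE_ECOLI, ORFS_COMPARED)

print(f"rescued in yeast: {rescue_rate:.1f}%")
print(f"expressed but insoluble in E. coli: {insoluble_share:.1f}%")
```

On these counts, roughly two thirds of the ORFs that were expressed but insoluble in E. coli were highly soluble in yeast.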
Project description:Insoluble recombinant proteins are a major issue for both structural genomics and enzymology research. Greater than 30% of recombinant proteins expressed in Escherichia coli (E. coli) appear to be insoluble. The prevailing view is that insolubly expressed proteins cannot be easily solubilized, and are usually sequestered into inclusion bodies. However, we hypothesize that small molecules added during the cell lysis stage can yield soluble protein from insoluble protein previously screened without additives or ligands. We present a novel screening method that utilized 144 additive conditions to increase the solubility of recombinant proteins expressed in E. coli. These selected additives are natural ligands, detergents, salts, buffers, and chemicals that have been shown to increase the stability of proteins in vivo. We present the methods used for this additive solubility screen and detailed results for 41 potential drug target recombinant proteins from infectious organisms. Increased solubility was observed for 80% of the recombinant proteins during the primary and secondary screening of lysis with the additives; that is 33 of 41 target proteins had increased solubility compared with no additive controls. Eleven additives (trehalose, glycine betaine, mannitol, L-Arginine, potassium citrate, CuCl(2), proline, xylitol, NDSB 201, CTAB and K(2)PO(4)) solubilized more than one of the 41 proteins; these additives can be easily screened to increase protein solubility. Large-scale purifications were attempted for 15 of the proteins using the additives identified and eight (40%) were prepared for crystallization trials during the first purification attempt. Thus, this protocol allowed us to recover about a third of seemingly insoluble proteins for crystallography and structure determination. If recombinant proteins are required in smaller quantities or less purity, the final success rate may be even higher.
Project description:Recombinant expression of proteins of interest in Escherichia coli is an important tool in the determination of protein structure. However, lack of expression and insolubility remain significant challenges to the expression and crystallization of these proteins. The SSGCID program uses a wheat germ cell-free expression system as a rescue pathway for proteins that are either not expressed or insoluble when produced in E. coli. Testing indicates that the system is a valuable tool for these protein targets. Further increases in solubility were obtained by the addition of the NVoy polymer reagent to the reaction mixture. These data indicate that this eukaryotic cell-free expression system has a high success rate and that the addition of specific reagents can increase the yield of soluble protein.
Project description:BACKGROUND: Signalling proteins often contain several well defined and conserved protein domains. Structural analyses of such domains by nuclear magnetic resonance spectroscopy or X-ray crystallography may greatly inform the function of proteins. A limiting step is often the production of sufficient amounts of the recombinant protein. However, there is no reliable way to predict whether a protein will be soluble when expressed in E. coli. Here we report our experience with expression of a Src homology 2 (SH2) domain. RESULTS: The SH2 domain of the SH2D2A protein (or T cell specific adapter protein, TSAd) forms insoluble aggregates when expressed as various GST-fusion proteins in Escherichia coli (E. coli). Altering the flanking sequences or the growth temperature influenced expression and solubility of TSAd-SH2; however, the overall yield of soluble protein remained low. The algorithm TANGO, which predicts amyloid fibril formation in eukaryotic cells, identified a hydrophobic sequence within the TSAd-SH2 domain with high propensity for beta-aggregation. Mutation to the corresponding amino acids of the related HSH2- (or ALX) SH2 domain increased the yield of soluble TSAd-SH2 domains. High beta-aggregation values predicted by TANGO correlated with low solubility of recombinant SH2 domains as reported in the literature. CONCLUSIONS: Solubility of recombinant proteins expressed in E. coli can be predicted by TANGO, an algorithm developed to determine the aggregation propensity of peptides. Targeted mutations representing corresponding amino acids in similar protein domains may increase solubility of recombinant proteins.
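The idea of scanning a domain for a hydrophobic, aggregation-prone stretch can be illustrated with a much cruder stand-in than TANGO. The sketch below is not TANGO and does not reproduce its scoring. It assumes a sliding-window mean of the Kyte-Doolittle hydropathy scale, and the window size, threshold, function names, and toy sequence are all hypothetical choices made for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every full-length window along the sequence."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    # A non-standard residue letter raises KeyError here.
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophobic_stretches(seq: str, window: int = 7,
                          threshold: float = 1.6) -> list[tuple[int, int]]:
    """Merge windows whose mean is >= threshold into (start, end) spans.

    Indices are 0-based and the end index is exclusive.
    """
    stretches: list[tuple[int, int]] = []
    for i, mean in enumerate(window_hydropathy(seq, window)):
        if mean < threshold:
            continue
        start, end = i, i + window
        if stretches and start <= stretches[-1][1]:
            # Overlaps or touches the previous span: extend it.
            stretches[-1] = (stretches[-1][0], end)
        else:
            stretches.append((start, end))
    return stretches


# Toy sequence (hypothetical, not TSAd-SH2): a valine run flanked by aspartates.
toy = "DDDDDDVVVVVVVDDDDDD"
print(hydrophobic_stretches(toy))
```

A flagged span only marks a candidate region. A hydropathy average ignores the beta-aggregation terms that a dedicated predictor such as TANGO models, so it should be read as a first-pass filter at most.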
Project description:The protein sequences of three known RNA 2'-O-ribose methylases were used as probes for detecting putative homologs through iterative searches of genomic databases. We have identified 45 new positive Open Reading Frames (ORFs), mostly in prokaryotic genomes. Five complete eukaryotic ORFs were also detected, among which was a single ORF (YDL112w) in the yeast Saccharomyces cerevisiae genome. After genetic depletion of YDL112w, we observed a specific defect in tRNA ribose methylation, with the complete disappearance of Gm18 in all tRNAs that naturally contain this modification, whereas other tRNA ribose methylations and the complex pattern of rRNA ribose methylations were not affected. The tRNA G18 methylation defect was suppressed by transformation of the disrupted strain with a plasmid allowing expression of YDL112wp. The formation of Gm18 on an in vitro transcript of a yeast tRNA(Ser) naturally containing this methylation, which was efficiently catalyzed by cell-free extracts from the wild-type yeast strain, did not occur with extracts from the disrupted strain. The protein encoded by the YDL112w ORF, termed Trm3 (tRNA methylation), is therefore likely to be the tRNA (Gm18) ribose methylase. In in vitro assays, its activity is strongly dependent on tRNA architecture. Trm3p, the first putative tRNA ribose methylase identified in a eukaryotic organism, is considerably larger than its Escherichia coli functional homolog spoU (1,436 amino acids vs. 229 amino acids), or any known or putative prokaryotic RNA ribose methyltransferase. Homologs found in human (TRP-185 protein), Caenorhabditis elegans and Arabidopsis thaliana also exhibit a very long N-terminal extension not related to any protein sequence in databases.
Project description:To produce large quantities of high quality eukaryotic membrane proteins in Saccharomyces cerevisiae, we modified a high-copy vector to express membrane proteins C-terminally fused to a Tobacco Etch Virus (TEV) protease detachable Green Fluorescent Protein (GFP)-8His tag, which facilitates localization, quantification, quality control, and purification. Using this expression system we examined the production of a human glucose transceptor and 11 nutrient transporters and transceptors from S. cerevisiae that have not previously been overexpressed in S. cerevisiae and purified. Whole-cell GFP-fluorescence showed that induction of GFP-fusion synthesis from a galactose-inducible promoter at 15°C resulted in stable accumulation of the fusions in the plasma membrane and in intracellular membranes. Expression levels of the 12 fusions estimated by GFP-fluorescence were in the range of 0.4 mg to 1.7 mg transporter per liter of cell culture. A detergent screen showed that n-dodecyl-β-D-maltopyranoside (DDM) is acceptable for solubilization of the membrane-integrated fusions. Extracts of solubilized membranes were prepared with this detergent and used for purifications by Ni-NTA affinity chromatography, which yielded partially purified full-length fusions. Most of the fusions were readily cleaved at a TEV protease site between the membrane protein and the GFP-8His tag. Using the yeast oligopeptide transporter Ptr2 as an example, we further demonstrate that almost pure transporters, free of the GFP-8His tag, can be achieved by TEV protease cleavage followed by reverse immobilized metal-affinity chromatography. The quality of the GFP-fusions was analysed by fluorescence size-exclusion chromatography. Membranes solubilized in DDM alone yielded preparations containing aggregated fusions. However, 9 of the fusions solubilized in DDM in the presence of cholesteryl hemisuccinate and specific substrates yielded monodisperse preparations with only minor amounts of aggregated membrane proteins. In conclusion, we developed a new effective S. cerevisiae expression system that may be used for production of high-quality eukaryotic membrane proteins for functional and structural analysis.
Project description:Biochemical and structural analysis of membrane proteins often critically depends on the ability to overexpress and solubilize them. To identify properties of eukaryotic membrane proteins that may be predictive of successful overexpression, we analyzed expression levels of the genomic complement of over 1000 predicted membrane proteins in a recently completed Saccharomyces cerevisiae protein expression library. We detected statistically significant positive and negative correlations between high membrane protein expression and protein properties such as size, overall hydrophobicity, number of transmembrane helices, and amino acid composition of transmembrane segments. Although expression levels of membrane and soluble proteins exhibited similar negative correlations with overall hydrophobicity, high-level membrane protein expression was positively correlated with the hydrophobicity of predicted transmembrane segments. To further characterize yeast membrane proteins as potential targets for structure determination, we tested the solubility of 122 of the highest expressed yeast membrane proteins in six commonly used detergents. Almost all the proteins tested could be solubilized using a small number of detergents. Solubility in some detergents depended on protein size, number of transmembrane segments, and hydrophobicity of predicted transmembrane segments. These results suggest that bioinformatic approaches may be capable of identifying membrane proteins that are most amenable to overexpression and detergent solubilization for structural and biochemical analyses. Bioinformatic approaches could also be used in the redesign of proteins that are not intrinsically well-adapted to such studies.
Project description:Cdc48, known as p97 or valosin-containing protein (VCP) in mammals, is an abundant AAA-ATPase that is essential for many ubiquitin-dependent processes. One well-documented role for Cdc48 is in facilitating the delivery of ubiquitylated misfolded endoplasmic reticulum proteins to the proteasome for degradation. By contrast, the role for Cdc48 in misfolded protein degradation in the nucleus is unknown. In the budding yeast Saccharomyces cerevisiae, degradation of misfolded proteins in the nucleus is primarily mediated by the nuclear-localized ubiquitin-protein ligase San1, which ubiquitylates misfolded nuclear proteins for proteasomal degradation. Here, we find that, although Cdc48 is involved in the degradation of some San1 substrates, it is not universally required. The difference in the requirement for Cdc48 correlates with the insolubility of the San1 substrate. The more insoluble the substrate, the more its degradation requires Cdc48. Expression of Cdc48-dependent San1 substrates in mutant cdc48 cells results in increased substrate insolubility, larger inclusion formation and reduced cell viability. Substrate ubiquitylation is increased in mutant cdc48 cells, suggesting that Cdc48 functions downstream of San1. Taken together, we propose that Cdc48 acts, in part, to maintain the solubility or reverse the aggregation of insoluble misfolded proteins prior to their proteasomal degradation.
Project description:BACKGROUND: Eukaryotic ubiquitin and SUMO are frequently used as tags to enhance fusion protein expression in microbial hosts. They increase solubility and stability, and protect the peptides from proteolytic degradation, owing to their stable and highly conserved structures. Few prokaryotic ubiquitin-like proteins have been used as fusion tags; the exception is ThiS, which enhances fusion expression but reduces the solubility and stability of the expressed peptides in E. coli. Hence, we investigated whether MoaD, a conserved small sulfur carrier in prokaryotes with a structure similar to that of ubiquitin, could also be used as a fusion tag for heterologous expression in E. coli. RESULTS: Fusion of MoaD to either end of EGFP enhanced the expression yield of EGFP with an efficacy similar to that of ThiS. However, most of the fusion protein was expressed in aggregated form, which was associated with retarded folding of EGFP, similar to ThiS fusions. Fusion of MoaD to insulin chain A or B did not boost their expression as efficiently as the ThiS tag did, probably due to less efficient aggregation of the products. Interestingly, fusion of MoaD to the murine ribonuclease inhibitor enhanced protein expression by completely protecting the protein from intracellular degradation, in contrast to ThiS fusion, which enhanced degradation of this unstable protein when expressed in E. coli. CONCLUSIONS: The prokaryotic ubiquitin-like protein MoaD can act as a fusion tag to promote fusion expression through varying mechanisms, which enriches the arsenal of fusion tags in the category of insoluble expression.
Project description:While access to soluble recombinant proteins is essential for a number of proteome studies, preparation of purified functional proteins is often limited by the protein solubility. In this study, potent solubility-enhancing fusion partners were screened from the repertoire of endogenous E. coli proteins. Based on the presumed correlation between the intracellular abundance and folding efficiency of proteins, PCR-amplified ORFs of a series of highly abundant E. coli proteins were fused with aggregation-prone heterologous proteins and then directly expressed for quantitative estimation of the expression efficiency of soluble translation products. Through two-step screening procedures involving the expression of 552 fusion constructs targeted against a series of cytokine proteins, we were able to discover a number of endogenous E. coli proteins that dramatically enhanced the soluble expression of the target proteins. This strategy of cell-free expression screening can be extended to quantitative, global analysis of genomic resources for various purposes.
Project description:Ticks transmit numerous pathogens, including borreliae, which cause Lyme disease. Tick saliva contains a complex mix of anti-host defense factors, including the immunosuppressive cysteine-rich secretory glycoprotein Salp15 from Ixodes scapularis ticks and orthologs like Iric-1 from Ixodes ricinus. All tick-borne microbes benefit from the immunosuppression at the tick bite site; in addition, borreliae exploit the binding of Salp15 to their outer surface protein C (OspC) for enhanced transmission. Hence, Salp15 proteins are attractive targets for anti-tick vaccines that also target borreliae. However, recombinant Salp proteins are not accessible in sufficient quantity for either vaccine manufacturing or for structural characterization. As an alternative to low-yield eukaryotic systems, we investigated cytoplasmic expression in Escherichia coli, even though this would not result in glycosylation. His-tagged Salp15 was efficiently expressed but insoluble. Among the various solubility-enhancing protein tags tested, DsbA was superior, yielding milligram amounts of soluble, monomeric Salp15 and Iric-1 fusions. Easily accessible mutants enabled epitope mapping of two monoclonal antibodies that, importantly, cross-react with glycosylated Salp15, and revealed interaction sites with OspC. Free Salp15 and Iric-1 from protease-cleavable fusions, despite limited solubility, allowed the recording of (1)H-(15)N 2D NMR spectra, suggesting partial folding of the wild-type proteins but not of Cys-free variants. Fusion to the NMR-compatible GB1 domain sufficiently enhanced solubility to reveal first secondary structure elements in (13)C/(15)N double-labeled Iric-1. Together, E. coli expression of appropriately fused Salp15 proteins may be highly valuable for the molecular characterization of the function and eventually the 3D structure of these medically relevant tick proteins.