Design and optimization of short DNA sequences that can be used as 5' fusion partners for high-level expression of heterologous genes in Escherichia coli.
ABSTRACT: The 5' terminal nucleotide sequence of a gene is often a bottleneck in recombinant protein production. The ifn-α2bS gene is poorly expressed in Escherichia coli unless a translocation signal sequence (pelB) is fused to the 5' end of the gene. A combined in silico and in vivo analysis reported here further indicates that the ifn-α2bS 5' coding sequence is suboptimal for efficient gene expression. ifn-α2bS therefore presents a suitable model gene for describing properties of 5' fusions promoting expression. We show that short DNA sequences corresponding to the 5' end of the highly expressed celB gene, whose protein product is cytosolic, can functionally replace pelB as a 5' fusion partner for efficient ifn-α2bS expression. celB fusions of various lengths (corresponding to a minimum of 8 codons) led to more than 7- and 60-fold stimulation of expression at the transcript and protein levels, respectively. Moreover, the presence of a celB-based fusion partner was found to moderately reduce the decay rate of the corresponding transcript. The 5' fusions thus appear to act by enhancing translation, and bound ribosomes may accordingly contribute to increased mRNA stability and reduced mRNA decay. However, other effects, such as altered protein stability, cannot be excluded. We also developed an experimental protocol that enabled us to identify improved variants of the celB fusion, and one of these (celBD11) could be used to additionally increase ifn-α2bS expression more than 4-fold at the protein level. Interestingly, celBD11 also stimulated greater protein production of three other medically important human genes than the wild-type celB fragment.
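The idea of a short 5' fusion partner can be illustrated with a minimal sketch. The function names, the decision to drop the target's own ATG, and the sequences used in the usage note are assumptions for illustration only; they are not the actual celB or ifn-α2bS sequences, and the abstract does not state how the junction was built.

```python
def codons(seq):
    """Split a coding sequence into complete codons (a trailing partial codon is dropped)."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def make_5prime_fusion(partner_cds, target_cds, n_codons=8):
    """Prepend the first n_codons of partner_cds to target_cds, keeping the reading frame.

    Assumption (not from the source): the target's own ATG start codon is
    removed so that translation starts only at the partner's ATG.
    """
    partner = codons(partner_cds)
    if len(partner) < n_codons:
        raise ValueError("partner sequence is shorter than the requested fusion length")
    if partner[0] != "ATG":
        raise ValueError("partner sequence must begin with an ATG start codon")
    target = codons(target_cds)
    if target and target[0] == "ATG":
        target = target[1:]
    return "".join(partner[:n_codons] + target)
```

For example, `make_5prime_fusion("ATG" + "GCT" * 9, "ATGTGTGATCTGTAA", n_codons=8)` returns a 36-nucleotide in-frame construct: 8 placeholder partner codons followed by the target without its start codon.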
Project description:Interleukin-12 (IL-12), a potent inducer of interferon gamma (IFNγ), is a heterodimeric protein consisting of p40 and p35 subunits whose expression is regulated independently. IL-12 is part of a cytokine family (currently consisting of IL-12, IL-23, IL-27, and IL-35) that can have profoundly different immunologic effects, despite sharing subunits. In constructing a single-chain fusion of p40 and p35, we discovered an insert corresponding to an intron in the gene encoding the p35 subunit that would result in a truncated form of p35 if translated. To test its possible role, we constructed, expressed, and analyzed fusions of p40 with the full-length or the truncated form of p35. The fusion protein containing the truncated p35 did not stimulate the proliferation of the IL-12-responsive cell line CTLL-2 nor did it induce IFNγ or the chemokine IFNγ-inducible protein 10 (IP-10, CXCL10) or monokine induced by IFNγ (MIG, CXCL9) from spleen cells. In striking contrast, the full-length IL-12 p40/p35 fusion induced robust responses in both assays. Moreover, the truncated IL-12 fusion protein inhibited the action of the full-length IL-12 p40/p35 fusion in the proliferation assay and also blocked the induction of IFNγ. These findings raise the possibility that alternative splicing may provide an additional regulatory mechanism for IL-12.
Project description:Recombinant human endostatin (rhEs) is an angiogenesis inhibitor used as a specific drug in the treatment of non-small-cell lung cancer. In the current research, we developed an efficient method for expressing a soluble form of the rhEs protein in the periplasmic space of Escherichia coli by fusing it to the pelB signal peptide. The human endostatin (hEs) gene was amplified using a synthetic hEs gene as a template, then cloned and expressed under the T7 lac promoter. IPTG was used as an inducer for rhEs expression. Next, osmotic shock was used to extract the protein from the periplasmic space. The presence of rhEs in the periplasmic space was confirmed by SDS-PAGE and Western blotting. The results show the applicability of the pelB fusion system for secreting rhEs into the periplasm of E. coli at laboratory scale. The rhEs represents approximately 35% (0.83 mg/l) of the total cell protein. The present study is apparently the first report of codon-optimized rhEs expression as a fusion with the pelB signal peptide. The results demonstrate the successful secretion of soluble rhEs to the periplasmic space.
Project description:The phytopathogenic enterobacterium Erwinia chrysanthemi 3937 produces five major and several secondary endo-pectate lyases encoded by the pel genes. Most of these genes are arranged in clusters on the bacterial chromosome. The genomic region surrounding the pelB-pelC cluster was supposed to be involved in the regulation of PelB and PelC synthesis. We demonstrated that the variation of pelB expression resulted from the titration of a regulatory protein by the gene adjacent to pelC. This gene was renamed pelZ since it encodes a protein of 420 amino acids with an endo-pectate lyase activity. Regulation of pelZ expression was investigated by using transcriptional fusions and a study of mRNA synthesis. Its transcription depends on different environmental conditions. It is induced in planta and in the presence of pectic catabolite products. This induction seems to be partially mediated by the KdgR protein but does not result from a direct interaction of KdgR with the pelZ 5' region. The transcription of pelZ leads to the synthesis of a monocistronic mRNA. However, the synthesis of a polycistronic mRNA from the pelC promoter, regulated by KdgR, is responsible for increased production of PelZ under inducing conditions. pelZ transcription is also controlled by pecT, which regulates some other pel genes, but it is independent of the pecS regulatory locus. The pelZ gene appears to be widespread in different strains of E. chrysanthemi. Moreover, a gene homologous to pelZ exists in Erwinia carotovora subsp. atroseptica adjacent to the cluster containing the pectate lyase-encoding genes pel1, pel2, and pel3. This conservation could reflect a significant role of PelZ in the pectinolytic system of Erwiniae. We showed pelZ is not a predominant virulence factor of E. chrysanthemi but is involved in host specificity.
Project description:Transcript fusions as a result of chromosomal rearrangements have been a focus of attention in cancer as they provide attractive therapeutic targets. To identify novel fusion transcripts with the potential to be exploited therapeutically, we analyzed RNA sequencing, DNA copy number and gene mutation data from 4366 primary tumor samples. To avoid false positives, we implemented stringent quality criteria that included filtering of fusions detected in RNAseq data from 364 normal tissue samples. Our analysis identified 7887 high confidence fusion transcripts across 13 tumor types. Our fusion prediction was validated by evidence of a genomic rearrangement for 78 of 79 fusions in 48 glioma samples where whole-genome sequencing data were available. Cancers with higher levels of genomic instability showed a corresponding increase in fusion transcript frequency, whereas tumor samples harboring fusions contained statistically significantly fewer driver gene mutations, suggesting an important role for tumorigenesis. We identified at least one in-frame protein kinase fusion in 324 of 4366 samples (7.4%). Potentially druggable kinase fusions involving ALK, ROS, RET, NTRK and FGFR gene families were detected in bladder carcinoma (3.3%), glioblastoma (4.4%), head and neck cancer (1.0%), low-grade glioma (1.5%), lung adenocarcinoma (1.6%), lung squamous cell carcinoma (2.3%) and thyroid carcinoma (8.7%), suggesting a potential for application of kinase inhibitors across tumor types. In-frame fusion transcripts involving histone methyltransferase or histone demethylase genes were detected in 111 samples (2.5%) and may additionally be considered as therapeutic targets. In summary, we described the landscape of transcript fusions detected across a large number of tumor samples and revealed fusion events with clinical relevance that have not been previously recognized. Our results support the concept of basket clinical trials where patients are matched with experimental therapies based on their genomic profile rather than the tissue where the tumor originated.
Project description:Gene fusions are being discovered at an increasing rate using massively parallel sequencing technologies. Prioritization of cancer fusion drivers for validation cannot be performed using traditional single-gene based methods because fusions involve portions of two partner genes. To address this problem, we propose a novel network analysis method called fusion centrality that is specifically tailored for prioritizing gene fusions. We first propose a domain-based fusion model built on the theory of exon/domain shuffling. The model leads to a hypothesis that a fusion is more likely to be an oncogenic driver if its partner genes act like hubs in a network because the fusion mutation can deregulate normal functions of many other genes and their pathways. The hypothesis is supported by the observation that for most known cancer fusion genes, at least one of the fusion partners appears to be a hub in a network, and for many fusions both partners appear to be hubs. Based on this model, we construct fusion centrality, a multi-gene-based network metric, and use it to score fusion drivers. We show that fusion centrality outperforms other single gene-based methods. Specifically, the method successfully predicts most of 38 newly discovered fusions that had validated oncogenic importance. To the best of our knowledge, this is the first network-based approach for identifying fusion drivers. Matlab code implementing the fusion centrality method is available upon request from the corresponding authors.
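The hub hypothesis above can be illustrated with a toy score. This is not the authors' fusion centrality metric, whose definition is not given in the abstract; the sketch assumes plain degree centrality and simply adds the centralities of the two partner genes. The function names and the gene labels in the usage note are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality for an undirected network given as (gene_a, gene_b) pairs."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {gene: 0.0 for gene in neighbors}
    return {gene: len(nb) / (n - 1) for gene, nb in neighbors.items()}


def toy_fusion_score(fusion, centrality):
    """Toy two-gene score: sum of the partners' centralities (an assumption,
    not the published metric). Genes absent from the network contribute 0."""
    five_prime, three_prime = fusion
    return centrality.get(five_prime, 0.0) + centrality.get(three_prime, 0.0)
```

With the hypothetical network `[("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]`, the fusion `("A", "B")` scores higher than `("D", "E")` because both of its partners are better connected.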
Project description:The signal peptide plays a key role in targeting and membrane insertion of secretory and membrane proteins in both prokaryotes and eukaryotes. In E. coli, recombinant proteins can be targeted to the periplasmic space by fusing naturally occurring signal sequences to their N-terminus. The model protein thioredoxin was fused at its N-terminus with malE and pelB signal sequences. While WT and the pelB fusion are soluble when expressed, the malE fusion was targeted to inclusion bodies and was refolded in vitro to yield a monomeric product with identical secondary structure to WT thioredoxin. The purified recombinant proteins were studied with respect to their thermodynamic stability, aggregation propensity and activity, and compared with wild type thioredoxin, without a signal sequence. The presence of signal sequences leads to thermodynamic destabilization, reduces the activity and increases the aggregation propensity, with malE having much larger effects than pelB. These studies show that besides acting as address labels, signal sequences can modulate protein stability and aggregation in a sequence dependent manner.
Project description:Gene expression profiling provides powerful analyses of transcriptional responses to cellular perturbation. In contrast to DNA array-based methods, reporter gene technology has been underused for this application. Here we describe a genomewide, genome-registered collection of Escherichia coli bioluminescent reporter gene fusions. DNA sequences from plasmid-borne, random fusions of E. coli chromosomal DNA to a Photorhabdus luminescens luxCDABE reporter allowed precise mapping of each fusion. The utility of this collection covering about 30% of the transcriptional units was tested by analyzing individual fusions representative of heat shock, SOS, OxyR, SoxRS, and cya/crp stress-responsive regulons. Each fusion strain responded as anticipated to environmental conditions known to activate the corresponding regulatory circuit. Thus, the collection mirrors E. coli's transcriptional wiring diagram. This genomewide collection of gene fusions provides an independent test of results from other gene expression analyses. Accordingly, a DNA microarray-based analysis of mitomycin C-treated E. coli indicated elevated expression of expected and unanticipated genes. Selected luxCDABE fusions corresponding to these up-regulated genes were used to confirm or contradict the DNA microarray results. The power of partnering gene fusion and DNA microarray technology to discover promoters and define operons was demonstrated when data from both suggested that a cluster of 20 genes encoding production of type I extracellular polysaccharide in E. coli form a single operon.
Project description:Fusion proteins play an important role in the production of recombinant proteins in Escherichia coli. They are mostly used for cytoplasmic expression since they can be designed to increase the solubility of the target protein, which then can be easily purified via affinity chromatography. In contrast, fusion proteins are not usually included in construct designs for periplasmic production. Instead, a signal sequence is inserted for protein transport into the periplasm and a C-terminal his-tag added for subsequent purification. Our research group has proposed the small metal-binding protein (SmbP) isolated from the periplasm of Nitrosomonas europaea as a new fusion protein to express recombinant proteins in the cytoplasm or periplasm of E. coli. SmbP also allows purification via immobilized metal affinity chromatography using Ni(II) ions. Recently, we have optimized the periplasmic production of proteins tagged with SmbP by exchanging its native signal peptide with one taken from pectate lyase B (PelB), substantially increasing the amount of protein produced. In this work, we have expressed and purified soluble bioactive human growth hormone (hGH) tagged with PelB-SmbP and obtained the highest periplasmic production reported for this protein so far. Its activity, tested on Nb2-11 cells, was equivalent to commercial growth hormone at 50 ng·mL⁻¹. Therefore, we strongly recommend the use of PelB-SmbP as a protein tag for the expression and purification of hGH or other possible target proteins in the periplasm of E. coli.
Project description:We report the design and implementation of a "breakpoint analysis" pipeline to discover novel gene fusions by tell-tale transcript level or genomic DNA copy number transitions occurring within genes. We use this method to prioritize candidate rearrangements from high density array CGH datasets as well as exon-resolution expression microarrays. We mine both publicly available data and datasets generated in our laboratory. Several gene fusion candidates were chosen for further characterization, and corresponding samples were profiled using paired end RNA sequencing to discover the identity of the gene fusion. Using this approach, we report the discovery and characterization of novel gene fusions spanning multiple cancer subtypes including angiosarcoma, pancreatic cancer, anaplastic astrocytoma, melanoma, breast cancer, and T-cell acute lymphoblastic leukemia. Taken together, this study provides a robust approach for gene fusion discovery, and our results highlight a more widespread role of fusion genes in cancer pathogenesis. Breakpoint analysis for the discovery of novel gene fusions across human cancers
Project description:RNA-seq is a well-established method for studying the transcriptome. Popular methods for library preparation in RNA-seq such as Illumina TruSeq® RNA v2 kit use a poly-A pulldown strategy. Such methods can cause loss of coverage at the 5' end of genes, impacting the ability to detect fusions when used on degraded samples. The goal of this study was to quantify the effects RNA degradation has on fusion detection when using poly-A selected mRNA and to identify the variables involved in this process. Using both artificially and naturally degraded samples, we found that there is a reduced ability to detect fusions as the distance of the breakpoint from the 3' end of the gene increases. The median transcript coverage decreases exponentially as a function of the distance from the 3' end and there is a linear relationship between the coverage decay rate and the RNA integrity number (RIN). Based on these findings we developed plots that show the probability of detecting a gene fusion ("sensitivity") as a function of the distance of the fusion breakpoint from the 3' end. This study developed a strategy to assess the impact that RNA degradation has on the ability to detect gene fusions by RNA-seq.
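The two relationships described above (exponential loss of coverage with distance from the 3' end, and a decay rate that depends linearly on RIN) can be sketched as follows. The intercept and slope are hypothetical placeholders chosen only so that the decay rate falls to zero at RIN 10; the abstract reports no fitted coefficients, and this sketch does not reproduce the study's sensitivity plots.

```python
import math

# Hypothetical coefficients (assumptions, not values from the study):
# decay rate per base pair = INTERCEPT + SLOPE * RIN
INTERCEPT = 1.0e-3
SLOPE = -1.0e-4


def decay_rate(rin, intercept=INTERCEPT, slope=SLOPE):
    """Coverage decay rate as a linear function of RIN, clamped at zero."""
    return max(0.0, intercept + slope * rin)


def relative_coverage(distance_bp, rin, intercept=INTERCEPT, slope=SLOPE):
    """Coverage relative to the 3' end at a given distance, assuming exponential decay."""
    return math.exp(-decay_rate(rin, intercept, slope) * distance_bp)
```

Under these placeholder coefficients, a breakpoint 1000 bp from the 3' end retains less relative coverage in a RIN 2 sample than in a RIN 8 sample, which mirrors the qualitative trend the abstract describes.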