Predictive design of sigma factor-specific promoters
ABSTRACT: To engineer synthetic gene circuits, molecular building blocks are developed that can modulate gene expression without interfering with each other or with the host cell's machinery. Promoter libraries of E. coli sigma factor 70 and B. subtilis sigma B-, F- and W-dependent promoters are used to construct models that predict both the transcription initiation frequency (TIF) of a promoter and the orthogonality of the specific promoters. The models are built from high-throughput DNA sequencing data generated from promoter libraries sorted by fluorescence-activated cell sorting.
INSTRUMENT(S): Illumina MiSeq
ORGANISM(S): Escherichia coli str. K-12 substr. MG1655
Project description:Despite extensive research, our understanding of the rules according to which cis-regulatory sequences are converted into gene expression is limited. We devised a method for obtaining parallel, highly accurate gene expression measurements from thousands of designed promoters and applied it to measure the effect of systematic changes in the location, number, orientation, affinity and organization of transcription-factor binding sites and nucleosome-disfavoring sequences. Our analyses reveal a clear relationship between expression and binding-site multiplicity, as well as dependencies of expression on the distance between transcription-factor binding sites and gene starts which are transcription-factor specific, including a striking ~10-bp periodic relationship between gene expression and binding-site location. We show how this approach can measure transcription-factor sequence specificities and the sensitivity of transcription-factor sites to the surrounding sequence context, and we compare the activity of 75 yeast transcription factors. Our method can be used to study both cis and trans effects of genotype on transcriptional, post-transcriptional and translational control. Expression profiling of 6500 synthetic promoters in yeast (Saccharomyces cerevisiae). The results were obtained using a method for pooled, highly accurate expression measurements (closely related to MPRA). In short, the synthetic promoters, which contain unique barcodes, are inserted upstream of a yellow fluorescence reporter gene; the cells are sorted by their fluorescence level (using FACS) into expression bins; the promoters in each expression bin are amplified and sent for parallel sequencing (SOLiD) to determine the percentage of cells carrying each promoter in each expression bin; finally, a mean expression value is extracted from the raw results.
The measurements contain two replicates of exponentially growing yeast cells in SC-Gal-URA (synthetic complete media with 2% galactose and without uracil).
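The sort-and-sequence readout described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the depth-rescaling step and the use of a representative fluorescence value per bin (`bin_centers`) are all assumptions made for the example.

```python
def bin_distribution(reads_in_bin, total_reads_per_bin, cell_fraction_per_bin):
    """Estimate the fraction of one promoter's cells that fall in each bin.

    Assumption: read counts are rescaled by the sequencing depth of each bin
    and by the fraction of sorted cells that bin received, so that deeply
    sequenced bins do not dominate the estimate.
    """
    weights = []
    for reads, total, fraction in zip(
        reads_in_bin, total_reads_per_bin, cell_fraction_per_bin
    ):
        weights.append((reads / total) * fraction if total > 0 else 0.0)
    norm = sum(weights)
    if norm == 0:
        raise ValueError("promoter has no reads in any bin")
    return [w / norm for w in weights]


def mean_expression(distribution, bin_centers):
    """Mean expression as the distribution-weighted average of bin centers."""
    return sum(p * c for p, c in zip(distribution, bin_centers))


# Hypothetical three-bin example for a single promoter.
dist = bin_distribution(
    reads_in_bin=[10, 30, 0],
    total_reads_per_bin=[100, 100, 100],
    cell_fraction_per_bin=[0.5, 0.25, 0.25],
)
mean = mean_expression(dist, bin_centers=[1.0, 2.0, 3.0])
```

In practice the bin centers would come from the FACS gates, and promoters with too few reads would normally be filtered out before a mean is reported.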
Project description:Genetically identical cells exhibit large variability (noise) in gene expression, with important consequences for cellular function. Although the amount of noise decreases with and is thus partly determined by the mean expression level, the extent to which different promoter sequences can deviate from this trend is not known. Here, we study how different noise levels are encoded by the promoter sequence using massively parallel noise measurements of thousands of synthetically designed promoters. We find that the noise levels of promoters with similar mean expression levels can vary over more than one order of magnitude, with nucleosome-disfavoring sequences resulting in lower noise and more transcription factor binding sites resulting in higher noise. We devised a computational model that can accurately predict the mean-independent component of the noise from DNA sequence alone. Our model suggests that the effect of promoters on noise is partly mediated by the combination of non-specific DNA binding and one-dimensional sliding along the DNA that occurs when transcription factors search for their target sites. Overall, our results demonstrate that small changes in the DNA sequence of promoters can allow tuning of noise levels in a manner that is largely predictable and partly decoupled from effects on the mean expression levels. These insights may assist in designing promoters with desired noise levels. Expression measurements of a collection of synthetic promoters that was published in Sharon et al., Nature Biotechnology 2012 (doi: 10.1038/nbt.2205). Two replicates of the promoter library integrated into a plasmid in yeast were measured in SC-Glu-URA medium. The promoter library was measured as described in Sharon et al. (2012), except for the differences below.
Briefly, a large collection of synthetic promoter reporter gene strains was generated by a pooled ligation of 6500 fully designed DNA oligos (obtained by synthesis on a microarray (LeProust et al. 2010) by Agilent Technologies, Santa Clara, California). The oligos were ligated upstream of a yellow fluorescent protein (YFP) gene with a short (100 bp) core promoter sequence taken from the HIS3 gene promoter, into a low-copy plasmid which also contains a TEF2 promoter driving a red fluorescent protein (mCherry). The resulting plasmids were then transformed into yeast (S. cerevisiae). Next, the pool of cells was grown under amino acid starvation conditions (SCD without amino acids except histidine) and sorted according to YFP expression level into 32 expression bins (mCherry was used for gating cells carrying a single plasmid copy and for normalization). The promoter DNA in each bin was then amplified and sent for multiplexed parallel sequencing. Each sequencing read was mapped to a specific promoter and expression bin, resulting in a distribution of the cells that contain each promoter across all expression bins. The following differences were applied relative to the description in Sharon et al. (2012). The medium used both for growing the cells and for sorting them was SC-Glu-URA (synthetic complete media with 2% glucose and without uracil) without amino acids, except for histidine. To obtain expression distributions with a resolution high enough to assess expression noise, the library cells were sorted into 32 bins according to the ratio of their YFP and mCherry expression levels, thereby normalizing for extrinsic noise effects. Each of the two extreme expression bins contained 2% of the library cells and each of the remaining 30 bins contained 3.2%. We collected a total of 10,000,000 cells. As previously described, the mapping of cells to bins involves parallel sequencing of the amplified promoter regions.
For this purpose, an Illumina HiSeq 2000 was used to obtain >30,000,000 mapped reads. The two replicates were separately generated from the ssDNA oligo library and separately measured as described above.
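The binning scheme above (two extreme bins of 2% each and 30 bins of 3.2% each, 10,000,000 cells in total) can be checked with a few lines of arithmetic, and a per-promoter noise value can then be computed from its distribution over bins. The sketch below assumes noise is quantified as the squared coefficient of variation (variance divided by squared mean), a common convention that the description itself does not spell out. The function name and `bin_centers` (a representative YFP/mCherry ratio per bin) are hypothetical.

```python
import math

TOTAL_CELLS = 10_000_000

# Two extreme bins at 2% each, 30 inner bins at 3.2% each (from the text).
bin_fractions = [0.02] + [0.032] * 30 + [0.02]
assert len(bin_fractions) == 32
assert math.isclose(sum(bin_fractions), 1.0)  # 2 * 2% + 30 * 3.2% = 100%

# Expected number of sorted cells per bin.
cells_per_bin = [round(f * TOTAL_CELLS) for f in bin_fractions]


def noise_cv2(distribution, bin_centers):
    """Squared coefficient of variation of a binned expression distribution.

    Assumption: `distribution` sums to 1 and `bin_centers` holds one
    representative expression value per bin.
    """
    mean = sum(p * c for p, c in zip(distribution, bin_centers))
    if mean == 0:
        raise ValueError("mean expression is zero; noise is undefined")
    variance = sum(p * (c - mean) ** 2 for p, c in zip(distribution, bin_centers))
    return variance / mean**2


# Hypothetical two-bin example: mean 2, variance 1, so CV^2 = 0.25.
example_noise = noise_cv2([0.5, 0.5], [1.0, 3.0])
```

With these fractions, each extreme bin is expected to hold 200,000 cells and each inner bin 320,000.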
Project description:We developed a high-throughput mutagenesis screen to comprehensively identify the cis-regulatory elements that control a target splicing event from the MST1R gene, which codes for the RON receptor tyrosine kinase. Skipping of alternative exon 11 results in a constitutively active isoform that promotes epithelial-to-mesenchymal transition and thereby contributes to the invasive phenotype of tumors. First, we created a library of mutated minigenes via mutagenic PCR. Importantly, the reverse primer introduced a random barcode sequence which labels the associated mutations. Next, the plasmid library was transfected as a pool; depending on the mutations, the transcripts exhibit changes in alternative splicing. The minigene library and the splicing outcome were analyzed by next-generation sequencing, and subsequent integration of the datasets resulted in a map of splicing regulatory sites. The DNA-seq experiment was performed to map the mutations and the associated barcodes in order to identify all minigene variants in the library. For sequencing, we generated five overlapping amplicons of the minigene library using five different forward primers: 5’CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCTNNNNNNNNNNCTATAGGGAGACCCAAGCTT 3’, 5’CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCTNNNNNNNNNNGTTCCACTGAAGCCTGAG 3’, 5’CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCTNNNNNNNNNNAGCTGCCAGCACGAGTTC 3’, 5’CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCTNNNNNNNNNNGAATCTGAGTGCCCGAGG 3’ and 5’CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCTNNNNNNNNNNctactggctggtcctcatga 3’, with 5’AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNATAGAATAGGGCCCTCTAGA 3’ as a common reverse primer. After amplification, the PCR products were cleaned using the GeneRead Size Selection Kit (QIAGEN) according to the manufacturer’s instructions.
The purified products were first analysed with the TapeStation 2200 capillary gel electrophoresis instrument (Agilent) and then fluorimetrically quantified using a Qubit fluorimeter (Thermo Scientific). Sequencing was carried out on the Illumina MiSeq platform using paired-end reads of 300 nt length and a 10% PhiX spike-in to increase sequence complexity.
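The barcode-to-mutation mapping that the DNA-seq experiment provides can be illustrated with a small sketch. It assumes that reads have already been aligned to the reference minigene without indels and that each read comes with its extracted barcode. The function names, the majority-vote rule and the `min_reads` threshold are hypothetical choices for illustration, not the published procedure.

```python
from collections import Counter, defaultdict


def call_substitutions(reference, read):
    """Return point substitutions as (1-based position, ref base, alt base)."""
    if len(reference) != len(read):
        raise ValueError("read must be aligned to the full reference length")
    return tuple(
        (i + 1, ref, alt)
        for i, (ref, alt) in enumerate(zip(reference, read))
        if ref != alt and alt != "N"  # ignore uncalled bases
    )


def barcode_variant_map(reference, barcoded_reads, min_reads=2):
    """Assign each barcode the mutation set seen most often among its reads.

    Barcodes whose top mutation set is supported by fewer than `min_reads`
    reads are dropped (an assumed quality filter).
    """
    votes = defaultdict(Counter)
    for barcode, read in barcoded_reads:
        votes[barcode][call_substitutions(reference, read)] += 1
    variants = {}
    for barcode, counts in votes.items():
        variant, support = counts.most_common(1)[0]
        if support >= min_reads:
            variants[barcode] = variant
    return variants


# Hypothetical toy data: a 4-nt "minigene" and three barcodes.
toy_reads = [
    ("AAA", "ACGT"), ("AAA", "ACGT"),                     # wild type
    ("CCC", "AGGT"), ("CCC", "AGGT"), ("CCC", "ACGT"),    # C2G, one discordant read
    ("GGG", "ACGA"),                                      # single read, filtered out
]
toy_map = barcode_variant_map("ACGT", toy_reads)
```

The resulting map is what allows splicing outcomes, read out by barcode in the RNA data, to be linked back to the mutations carried by each minigene variant.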
Project description:The evolutional trajectory of gut microbial colonization from birth has been shown to prime for health later in life. Here, we combined cultivation-independent 16S rRNA gene sequencing and metaproteomics to investigate the functional maturation of gut microbiota in faecal samples from full-term healthy infants collected at 6 and 18 months of age. Phylogenetic analysis of the metaproteomes showed that Bifidobacterium provided the highest number of distinct protein groups. Considerable divergences between taxa abundance and protein phylogeny were observed at all taxonomic ranks. Age had a profound effect on early microbiota where compositional and functional complexity of less dissimilar communities increased with time. Comparisons of the relative abundances of proteins revealed the transition of taxon-associated saccharolytic and carbon metabolism strategies from catabolic pathways of milk and mucin-derived monosaccharides feeding acetate/propanoate synthesis to complex food sugars fuelling butyrate production. Furthermore, co-occurrence network analysis uncovered two anti-correlated modules of functional taxa. A low-connected Bifidobacteriaceae-centred guild of facultative anaerobes was succeeded by a rich club of obligate anaerobes densely interconnected around Lachnospiraceae, underpinning their pivotal roles in microbial ecosystem assemblies. Our findings establish a framework to visualize whole microbial community metabolism and ecosystem succession dynamics, proposing opportunities for microbiota-targeted health-promoting strategies early in life.
Project description:Frequent long-range epigenetic silencing of protocadherin gene clusters on chromosome 5q31 in Wilms' tumour. The data consists of five microarrays hybridized with methylated DNA immunoprecipitated from Wilms' tumours and from a normal foetal kidney control.
Project description:Characterization of a metagenomic regulatory sequence library derived from M. xanthus, E. coli, and O. urethralis genomes in strains expressing different RpoD ortholog variants. Targeted DNA-seq and RNA-seq were used to profile the relative DNA and RNA abundances, respectively, of each regulatory sequence construct in the library.
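One common way to turn paired DNA and RNA abundances into a per-construct activity score is the log2 ratio of relative RNA abundance to relative DNA abundance. The sketch below shows that calculation under stated assumptions: the entry does not say which normalization was used, and the function name and the pseudocount are hypothetical.

```python
import math


def log2_rna_over_dna(dna_counts, rna_counts, pseudocount=1.0):
    """log2(relative RNA abundance / relative DNA abundance) per construct.

    Assumptions: counts are raw read counts keyed by construct ID; constructs
    without any DNA reads are skipped; a positive pseudocount is added to
    every count so that constructs with zero RNA reads stay finite.
    """
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    constructs = [c for c, n in dna_counts.items() if n > 0]
    dna = {c: dna_counts[c] + pseudocount for c in constructs}
    rna = {c: rna_counts.get(c, 0) + pseudocount for c in constructs}
    dna_total = sum(dna.values())
    rna_total = sum(rna.values())
    return {
        c: math.log2((rna[c] / rna_total) / (dna[c] / dna_total))
        for c in constructs
    }


# Hypothetical counts for two constructs present at equal DNA abundance.
scores = log2_rna_over_dna({"a": 49, "b": 49}, {"a": 74, "b": 24})
```

Normalizing by DNA abundance corrects for unequal representation of constructs in the library, so the score reflects transcriptional output per template rather than library composition.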
Project description:Reporter genes integrated into the genome are a powerful tool to reveal effects of regulatory elements and local chromatin context on gene expression. However, so far such reporter assays have been of low throughput. Here we describe a multiplexing approach for the parallel monitoring of transcriptional activity of thousands of randomly integrated reporters. More than 27,000 distinct reporter integrations in mouse embryonic stem cells, obtained with two different promoters, show ~1,000-fold variation in expression levels. Data analysis indicates that lamina-associated domains act as attenuators of transcription, likely by reducing access of transcription factors to binding sites. Furthermore, chromatin compaction is predictive of reporter activity. We also found evidence for cross-talk between neighboring genes, and estimate that enhancers can influence gene expression on average over ~20 kb. The multiplexed reporter assay is highly flexible in design and can be modified to query a wide range of aspects of gene regulation. TRIP assay of mPGK promoter; 6 experiments, each with 2 technical replicates.
Project description:Gemcitabine treatment shifts the intestinal microbiota of PC mice towards an inflammatory profile, which may worsen the mucositis and side effects observed upon chemotherapy. We explored the effect of a specific probiotic blend administered, with or without gemcitabine treatment, to PC-xenografted mice.
Project description:Based on the hypothesis that enhancing the local concentration of donor oligos could increase correction rates, we generated and tested novel CRISPR-Cas9 systems in which the DNA repair template is covalently conjugated to Cas9 (RNPD system). To validate our results from the HEK293T reporter cells, we here tested our approach at different endogenous genomic loci and in different cell types. We first targeted the human beta globin (HBB) locus in the K562 cell line and analyzed correction and editing frequencies using next-generation sequencing (NGS). Next, we targeted the Rosa26 and proprotein convertase subtilisin/kexin type 9 (Pcsk9) loci in mouse embryonic stem cells (mESCs). Here, the RNPD system was always compared to Cas9 SNAP-tag fusion proteins with uncoupled donor oligos. To also directly compare the engineered RNPD system to the classical CRISPR-Cas9 system, we performed experiments in which wild-type Cas9 with uncoupled donor oligos served as a control. For this, we targeted the fluorescent reporter locus as well as the endogenous loci HBB, empty spiracles homeobox 1 (EMX1), and C-X-C chemokine receptor type 4 (CXCR4) in HEK293T cells. Finally, we analyzed three computationally predicted off-target sites of the reporter locus.
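Correction and editing frequencies from amplicon NGS reads could be tallied along the following lines. This is a deliberately simplified sketch. It assumes each read spans the target window, that "correction" means an exact match to the donor-templated sequence, and that "editing" means any deviation from the wild-type sequence, including corrected reads. These definitions and all names are assumptions, and real analyses would also handle sequencing errors and alignment.

```python
from collections import Counter


def editing_frequencies(reads, wild_type, corrected):
    """Return correction and editing frequencies for one target locus.

    correction: fraction of reads identical to the donor-templated sequence.
    editing:    fraction of reads that differ from wild type (corrected reads
                plus any other modified reads such as indels).
    """
    if not reads:
        raise ValueError("no reads supplied")
    counts = Counter()
    for read in reads:
        if read == corrected:
            counts["corrected"] += 1
        elif read == wild_type:
            counts["unedited"] += 1
        else:
            counts["other_edit"] += 1
    total = len(reads)
    return {
        "correction": counts["corrected"] / total,
        "editing": (counts["corrected"] + counts["other_edit"]) / total,
    }


# Hypothetical toy window: 6 wild-type, 3 corrected, 1 indel-like read.
toy = ["ACGT"] * 6 + ["ACTT"] * 3 + ["ACT"]
freqs = editing_frequencies(toy, wild_type="ACGT", corrected="ACTT")
```

Comparing these two numbers between the conjugated and uncoupled donor conditions is what indicates whether the conjugation raises the share of precise corrections among all editing events.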