Predicting strength and function for promoters of the Escherichia coli alternative sigma factor, sigmaE.
ABSTRACT: Sequenced bacterial genomes provide a wealth of information but little understanding of transcriptional regulatory circuits largely because accurate prediction of promoters is difficult. We examined two important issues for accurate promoter prediction: (1) the ability to predict promoter strength and (2) the sequence properties that distinguish between active and weak/inactive promoters. We addressed promoter prediction using natural core promoters recognized by the well-studied alternative sigma factor, Escherichia coli sigma(E), as a representative of group 4 sigmas, the largest sigma group. To evaluate the contribution of sequence to promoter strength and function, we used modular position weight matrix models comprised of each promoter motif and a penalty score for suboptimal motif location. We find that a combination of select modules is moderately predictive of promoter strength and that imposing minimal motif scores distinguished active from weak/inactive promoters. The combined -35/-10 score is the most important predictor of activity. Our models also identified key sequence features associated with active promoters. A conserved "AAC" motif in the -35 region is likely to be a general predictor of function for promoters recognized by group 4 sigmas. These results provide valuable insights into sequences that govern promoter strength, distinguish active and inactive promoters for the first time, and are applicable to both in vivo and in vitro measures of promoter strength.
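The modular scoring idea described above (a position weight matrix per promoter motif plus a penalty for suboptimal motif spacing) can be sketched as follows. This is a minimal illustration, not the authors' model: the aligned example sites, the optimal spacer of 16 bp, the allowed spacer range, and the penalty of 1.0 per bp are all assumptions chosen for the example, and the function names are hypothetical.

```python
import math


def log_odds_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned example sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score_motif(pwm, window):
    """Sum the per-position log-odds scores for one sequence window."""
    return sum(col[base] for col, base in zip(pwm, window))


def score_promoter(seq, pwm35, pwm10, optimal_spacer=16,
                   penalty_per_bp=1.0, spacers=range(15, 18)):
    """Return (score, -35 start, spacer) for the best -35/-10 pairing.

    The combined score is the -35 score plus the -10 score minus a
    penalty proportional to the deviation from the assumed optimal spacer.
    """
    n35, n10 = len(pwm35), len(pwm10)
    best = None
    for start in range(len(seq) - n35 + 1):
        for spacer in spacers:
            start10 = start + n35 + spacer
            if start10 + n10 > len(seq):
                continue
            total = (score_motif(pwm35, seq[start:start + n35])
                     + score_motif(pwm10, seq[start10:start10 + n10])
                     - penalty_per_bp * abs(spacer - optimal_spacer))
            if best is None or total > best[0]:
                best = (total, start, spacer)
    return best


if __name__ == "__main__":
    # Toy training sites (assumed for illustration only).
    pwm35 = log_odds_pwm(["GGAACTT", "GGAACTT", "GAAACTT"])
    pwm10 = log_odds_pwm(["TCTGA", "TCTGA", "TCAAA"])
    print(score_promoter("GGAACTT" + "C" * 16 + "TCTGA", pwm35, pwm10))
```

A minimal-score cutoff on the combined -35/-10 score could then be applied to the returned value to separate candidate active promoters from weak or inactive ones; the cutoff itself would have to be calibrated on measured promoters.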
Project description:Bacteria often cope with environmental stress by inducing alternative sigma factors, which direct RNA polymerase to specific promoters, thereby inducing a set of genes called a regulon to combat the stress. To understand the conserved and organism-specific functions of each sigma, it is necessary to be able to predict their promoters, so that their regulons can be followed across species. However, the variability of promoter sequences and motif spacing makes their prediction difficult. We developed and validated an accurate promoter prediction model for Escherichia coli sigmaE, which enabled us to predict a total of 89 unique sigmaE-controlled transcription units in E. coli K-12 and eight related genomes. SigmaE controls the envelope stress response in E. coli K-12. The portion of the regulon conserved across genomes is functionally coherent, ensuring the synthesis, assembly, and homeostasis of lipopolysaccharide and outer membrane porins, the key constituents of the outer membrane of Gram-negative bacteria. The larger variable portion is predicted to perform pathogenesis-associated functions, suggesting that sigmaE provides organism-specific functions necessary for optimal host interaction. The success of our promoter prediction model for sigmaE suggests that it will be applicable for the prediction of promoter elements for many alternative sigma factors.
Project description:Sigma32 controls expression of heat shock genes in Escherichia coli and is widely distributed in proteobacteria. The distinguishing feature of sigma32 promoters is a long -10 region (CCCCATNT) whose tetra-C motif is important for promoter activity. Using alanine-scanning mutagenesis of sigma32 and in vivo and in vitro assays, we identified promoter recognition determinants of this motif. The most downstream C (-13) is part of the -10 motif; our work confirms and extends recognition determinants of -13C. Most importantly, our work suggests that the two upstream Cs (-16, -15) constitute an 'extended -10' recognition motif that is recognized by K130, a residue universally conserved in beta- and gamma-proteobacteria. This residue is located in the alpha-helix of sigma Domain 3 that mediates recognition of the extended -10 promoter motif in other sigmas. K130 is not conserved in alpha- and delta-/epsilon-proteobacteria and we found that sigma32 from the alpha-proteobacterium Caulobacter crescentus does not need the extended -10 motif for high promoter activity. This result supports the idea that K130 mediates extended -10 recognition. Sigma32 is the first Group 3 sigma shown to use the 'extended -10' recognition motif.
Project description:BACKGROUND: In prokaryotes, sigma factors are essential for directing the transcription machinery towards promoters. Various sigma factors have been described that recognize and bind to specific DNA sequence motifs in promoter sequences. The canonical sigma factor sigma(70) is commonly involved in transcription of the cell's housekeeping genes, which is mediated by the conserved sigma(70) promoter sequence motifs. In this study the sigma(70)-promoter sequences in Lactobacillus plantarum WCFS1 were predicted using a genome-wide analysis. The accuracy of the transcriptionally-active part of this promoter prediction was subsequently evaluated by correlating locations of predicted promoters with transcription start sites inferred from the 5'-ends of transcripts detected by high-resolution tiling array transcriptome datasets. RESULTS: To identify sigma(70)-related promoter sequences, we performed a genome-wide sequence motif scan of the L. plantarum WCFS1 genome focussing on the regions upstream of protein-encoding genes. We obtained several highly conserved motifs including those resembling the conserved sigma(70)-promoter consensus. Position weight matrices-based models of the recovered sigma(70)-promoter sequence motif were employed to identify 3874 motifs with significant similarity (p-value < 10^-4) to the model-motif in the L. plantarum genome. Genome-wide transcript information deduced from whole genome tiling-array transcriptome datasets was used to infer transcription start sites (TSSs) from the 5'-end of transcripts. By this procedure, 1167 putative TSSs were identified that were used to corroborate the transcriptionally active fraction of these predicted promoters. In total, 568 predicted promoters were found in proximity (≤ 40 nucleotides) of the putative TSSs, showing a highly significant co-occurrence of predicted promoter and TSS (p-value < 10^-263).
CONCLUSIONS: High-resolution tiling arrays provide a suitable source to infer TSSs at a genome-wide level, and allow experimental verification of in silico predicted promoter sequence motifs.
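The corroboration step described above, counting predicted promoters that lie within a fixed distance of an inferred TSS, can be sketched as follows. This is a simplified illustration under stated assumptions: positions are plain integer coordinates on one strand, strand and orientation are ignored, the function name is hypothetical, and the example coordinates are invented.

```python
from bisect import bisect_left


def promoters_near_tss(promoter_positions, tss_positions, max_distance=40):
    """Return the predicted promoter positions that have at least one TSS
    within max_distance nucleotides (on either side; strand is ignored)."""
    tss_sorted = sorted(tss_positions)
    supported = []
    for pos in promoter_positions:
        # Index of the first TSS at or beyond the left edge of the window.
        i = bisect_left(tss_sorted, pos - max_distance)
        if i < len(tss_sorted) and tss_sorted[i] <= pos + max_distance:
            supported.append(pos)
    return supported


if __name__ == "__main__":
    # Invented coordinates, for illustration only.
    predicted = [100, 500, 900]
    inferred_tss = [130, 560, 2000]
    print(promoters_near_tss(predicted, inferred_tss))
```

Sorting the TSS list once and using a binary search per promoter keeps the comparison fast even for thousands of motifs and TSSs. Assessing the significance of the overlap would require an additional statistical test that is not shown here.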
Project description:Predicting the location and strength of promoters from genomic sequence requires accurate sequence-based promoter models. We present the first model of a full-length bacterial promoter, encompassing both upstream sequences (UP-elements) and core promoter modules, based on a set of 60 promoters dependent on sigma(E), an alternative ECF-type sigma factor. UP-element contribution, best described by the length and frequency of A- and T-tracts, in combination with a PWM-based core promoter model, accurately predicted promoter strength both in vivo and in vitro. This model also distinguished active from weak/inactive promoters. Systematic examination of promoter strength as a function of RNA polymerase (RNAP) concentration revealed that UP-element contribution varied with RNAP availability and that the sigma(E) regulon is comprised of two promoter types, one of which is active only at high concentrations of RNAP. Distinct promoter types may be a general mechanism for increasing the regulatory capacity of the ECF group of alternative sigmas. Our findings provide important insights into the sequence requirements for the strength and function of full-length promoters and establish guidelines for promoter prediction and for forward engineering promoters of specific strengths.
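As a rough illustration of describing an upstream region by the length and frequency of its A- and T-tracts, the sketch below counts homopolymeric A or T runs and reports the longest one. The minimum tract length of 3 and the function name are assumptions made for this example; the abstract does not state how tracts were defined or weighted in the actual model.

```python
import re


def at_tract_features(upstream_seq, min_len=3):
    """Return (number of tracts, longest tract length) for runs of A or
    runs of T that are at least min_len nucleotides long."""
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    tracts = pattern.findall(upstream_seq.upper())
    longest = max((len(t) for t in tracts), default=0)
    return len(tracts), longest


if __name__ == "__main__":
    # Invented upstream sequence, for illustration only.
    print(at_tract_features("GCAAAAGCTTTGCAT"))
```

Features like these could be combined with a core promoter PWM score in a regression on measured promoter strength, but the weighting would need to be fitted to data.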
Project description:Background: In prokaryotes, sigma factors are essential for the targeting of the transcription machinery to promoters. Various sigma factors have been described that recognize and bind to specific DNA sequence motifs in promoter sequences. The canonical sigma factor sigma70 is commonly involved in transcription of the cell's general housekeeping genes, which is mediated by the conserved sigma70 promoter sequence motifs. This study detects and predicts the general sigma70-promoter sequences in Lactobacillus plantarum WCFS1 using genome-wide analysis. The accuracy of the transcriptionally-active part of this promoter prediction was subsequently evaluated by correlating locations of predicted promoters with experimentally detected transcription start sites (TSSs) using high-resolution tiling array transcriptome datasets. Results: To identify sigma70-related promoter sequences, we performed a genome-wide sequence motif scan of the L. plantarum WCFS1 genome focusing on the regions upstream of protein-encoding genes. We obtained several highly conserved motifs including those resembling the conserved sigma70-promoter consensus. Position weight matrices-based models of the recovered sigma70-promoter sequence motif were employed to identify 4746 motifs with significant similarity (p-value < 10^-4) to the model-motif in the L. plantarum genome. Genome-wide transcription start site information deduced from whole genome tiling-array transcriptome datasets revealed 936 TSSs that were employed to validate the transcriptionally active fraction of these predicted promoters. In total, 578 predicted promoters were found in proximity (≤ 100 nucleotides) of the identified TSSs, showing a highly significant co-occurrence of predicted promoter and measured TSS (p-value < 10^-23). An additional 224 predicted promoters were validated when the significant similarity to the model-motif was applied less strictly (10^-4 ≤ p-value < 10^-3).
Conclusions: High-resolution tiling arrays provide a suitable source for TSS detection at a genome-wide level, and allow experimental verification of in silico-predicted promoter sequence motifs. Triplicate hybridizations (technical replicates) of two L. plantarum WCFS1 samples (biological replicates) gathered from a fermentation run at OD600 of 1. Dye swap applied to one of the three technical replicates.
Project description:The sigmaS subunit of Escherichia coli RNA polymerase holoenzyme (EsigmaS) is a key factor of gene expression upon entry into stationary phase and in stressful conditions. The selectivity of promoter recognition by EsigmaS and the housekeeping Esigma70 is as yet not clearly understood. We used a genetic approach to investigate the interaction of sigmaS with its target promoters. Starting with down-promoter variants of a sigmaS promoter target, osmEp, altered in the -10 or -35 elements, we isolated mutant forms of sigmaS suppressing the promoter defects. The activity of these suppressors on variants of osmEp and ficp, another target of sigmaS, indicated that sigmaS is able to interact with the same key features within a promoter sequence as sigma70. Indeed, (i) sigmaS can recognize the -35 element of some but not all its target promoters, through interactions with its 4.2 region; and (ii) amino acids within the 2.4 region participate in the recognition of the -10 element. More specifically, residues Q152 and E155 contribute to the strong preference of sigmaS for a C in position -13 and residue R299 can interact with the -31 nucleotide in the -35 element of the target promoters.
Project description:Thermus thermophilus sigma(E), an extracytoplasmic function sigma factor from the extremely thermophilic bacterium Thermus thermophilus HB8, bound to the RNA polymerase core enzyme and showed transcriptional activity. With the combination of in vitro transcription assay and GeneChip technology, we identified three promoters recognized by sigma(E). The predicted consensus promoter sequence for sigma(E) is 5'-CA(A/T)(A/C)C(A/C)-N(15)-CCGTA-3'.
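The consensus stated above, 5'-CA(A/T)(A/C)C(A/C)-N(15)-CCGTA-3', translates directly into a regular expression. The sketch below scans only the given strand for exact consensus matches; the function name is hypothetical, and a real search would also need to scan the reverse complement and would likely tolerate mismatches.

```python
import re

# CA, then A or T, then A or C, then C, then A or C, a 15-nt spacer, then CCGTA.
# The lookahead lets overlapping candidate sites be reported as well.
CONSENSUS = re.compile(r"(?=(CA[AT][AC]C[AC][ACGT]{15}CCGTA))")


def find_consensus_sites(seq):
    """Return (0-based start, matched sequence) for each exact match to the
    consensus on the given strand."""
    return [(m.start(), m.group(1)) for m in CONSENSUS.finditer(seq.upper())]


if __name__ == "__main__":
    # Invented test sequence, for illustration only.
    example = "GG" + "CAAACA" + "G" * 15 + "CCGTA" + "TT"
    print(find_consensus_sites(example))
```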
Project description:The sigmaS (or RpoS) subunit of RNA polymerase is the master regulator of the general stress response in Escherichia coli. While nearly absent in rapidly growing cells, sigmaS is strongly induced during entry into stationary phase and/or many other stress conditions and is essential for the expression of multiple stress resistances. Genome-wide expression profiling data presented here indicate that up to 10% of the E. coli genes are under direct or indirect control of sigmaS and that sigmaS should be considered a second vegetative sigma factor with a major impact not only on stress tolerance but on the entire cell physiology under nonoptimal growth conditions. This large data set allowed us to unequivocally identify a sigmaS consensus promoter in silico. Moreover, our results suggest that sigmaS-dependent genes represent a regulatory network with complex internal control (as exemplified by the acid resistance genes). This network also exhibits extensive regulatory overlaps with other global regulons (e.g., the cyclic AMP receptor protein regulon). In addition, the global regulatory protein Lrp was found to affect sigmaS and/or sigma70 selectivity of many promoters. These observations indicate that certain modules of the sigmaS-dependent general stress response can be temporarily recruited by stress-specific regulons, which are controlled by other stress-responsive regulators that act together with sigma70 RNA polymerase. Thus, not only the expression of genes within a regulatory network but also the architecture of the network itself can be subject to regulation.
Project description:RybB is a small, Hfq-binding noncoding RNA originally identified in a screen of conserved intergenic regions in Escherichia coli. Fusions of the rybB promoter to lacZ were used to screen plasmid genomic libraries and genomic transposon mutants for regulators of rybB expression. A number of plasmids, including some carrying rybB, negatively regulated the fusion. An insertion in the rep helicase and one upstream of dnaK decreased expression of the fusion. Multicopy suppressors of these insertions led to identification of two plasmids that stimulated the fusion. One contained the gene for the response regulator OmpR; the second contained mipA, encoding a murein hydrolase. The involvement of MipA and OmpR in cell surface synthesis suggested that the rybB promoter might be dependent on sigma(E). The sequence upstream of the +1 of rybB contains a consensus sigma(E) promoter. The activity of rybB-lacZ was increased in cells lacking the RseA anti-sigma factor and when sigma(E) was overproduced from a heterologous promoter. The activity of rybB-lacZ and the detection of RybB were totally abolished in an rpoE-null strain. In vitro, sigma(E) efficiently transcribes from this promoter. Both a rybB mutation and an hfq mutation significantly increased expression of both rybB-lacZ and rpoE-lacZ fusions, consistent with negative regulation of the sigma(E) response by RybB and other small RNAs. Based on the plasmid screens, NsrR, a repressor sensitive to nitric oxide, was also found to negatively regulate sigma(E)-dependent promoters in an RseA-independent fashion.
Project description:The extracytoplasmic factor (ECF) sigma factor sigma(E) is one of the most studied sigma factors of Mycobacterium tuberculosis. It has been shown to be involved in virulence as well as in survival under conditions of high temperature, alkaline pH, and exposure to detergents and oxidative stress. Unlike many ECF sigma factors, sigma(E) does not directly regulate the transcription of its own gene. Two promoters have been identified upstream of the sigE gene; one is regulated by the two-component system MprAB, while the other has been shown to be sigma(H) dependent. In this paper, we further characterize the regulation of sigma(E) by identifying its anti-sigma factor and a previously unknown promoter. Finally, we show that sigE can be translated from three different translational start codons, depending on the promoter used. Taken together, our data demonstrate that sigma(E) not only is subjected to complex transcriptional regulation but is also controlled at the translational and posttranslational levels.