Teasing apart translational and transcriptional components of stochastic variations in eukaryotic gene expression.
ABSTRACT: The intrinsic stochasticity of gene expression leads to cell-to-cell variation (noise) in protein abundance. Several processes, including transcription, translation, and degradation of mRNA and proteins, can contribute to this variation. Recent single-cell analyses of gene expression in yeast have uncovered a general trend in which expression noise scales with protein abundance. This trend is consistent with a stochastic model of gene expression in which mRNA copy number follows a random birth-and-death process. However, some deviations from this basic trend have also been observed, prompting questions about the contribution of gene-specific features to such deviations. For example, recent studies have pointed to the TATA box as a sequence feature that can influence expression noise by facilitating expression bursts. Noise originating in transcription can potentially be amplified further in translation. We therefore asked to what extent sequence features known or postulated to accompany translation efficiency are also associated with an increase in noise strength and, on average, how such an increase compares to the amplification associated with the TATA box. Untangling the different components of expression noise is highly nontrivial, as they may be gene or gene-module specific. In particular, focusing on codon usage as one of the sequence features associated with efficient translation, we found that ribosomal genes display a different relationship between expression noise and codon usage than other genes. Within nonribosomal genes, we found that high codon usage is correlated with increased noise relative to the average noise of proteins with the same abundance. Interestingly, by projecting the data onto a theoretical model of gene expression, we found that the amplification of noise strength associated with codon usage is comparable to that of the TATA box, suggesting that the effect of translation on noise in eukaryotic gene expression might be more prominent than previously appreciated.
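The abstract does not give the equations of its theoretical model. As a rough illustration only, the sketch below assumes the standard two-stage model (mRNA birth and death, plus translation and protein decay) and its well-known steady-state moments. The function name and all rate parameters (`k_m`, `g_m`, `k_p`, `g_p`) are hypothetical and are not taken from the paper.

```python
def two_stage_noise(k_m, g_m, k_p, g_p):
    """Steady-state protein statistics for an assumed two-stage model.

    k_m: mRNA synthesis rate        g_m: mRNA degradation rate
    k_p: translation rate per mRNA  g_p: protein degradation rate
    (all names and values are illustrative assumptions)
    """
    mean_mrna = k_m / g_m
    mean_protein = mean_mrna * k_p / g_p
    # Noise strength (Fano factor, variance / mean) of protein copy number.
    noise_strength = 1.0 + k_p / (g_m + g_p)
    # Squared coefficient of variation (variance / mean^2).
    cv2 = noise_strength / mean_protein
    return mean_protein, noise_strength, cv2


# Two hypothetical genes with the same mean abundance: the second reaches it
# with half the transcription rate and twice the translation rate.
low_translation = two_stage_noise(k_m=2.0, g_m=1.0, k_p=10.0, g_p=0.1)
high_translation = two_stage_noise(k_m=1.0, g_m=1.0, k_p=20.0, g_p=0.1)
```

Under these assumptions, the two genes have equal mean protein abundance, but the one with more efficient translation has the larger noise strength. This is the kind of translational amplification the abstract discusses.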
Project description:Fluctuations in protein abundance among single cells are primarily due to the inherent stochasticity of transcription and translation, and such stochasticity can often confer phenotypic heterogeneity among isogenic cells. It has been proposed that expression noise can be triggered as an adaptation to environmental stresses and genetic perturbations, and as a mechanism to facilitate gene expression evolution. Thus, elucidating the relationship between expression noise, measured at the single-cell level, and expression variation, measured on populations of cells, can improve our understanding of the variability and evolvability of gene expression. Here, we showed that noise levels are significantly correlated with conditional expression variations. We further demonstrated that expression variations are highly predictive of noise level, especially in TATA-box-containing genes. Our results suggest that expression variability can serve as a proxy for noise level, and that these two properties share the same underlying mechanism, e.g. chromatin regulation. Our work paves the way for the study of stochastic noise in other single-cell organisms.
Project description:We propose a simple, sensitive measure of synonymous codon usage bias, the Relative Codon Adaptation Index (rCAI), as a way to discriminate better between highly biased and unbiased regions than the widely used Codon Adaptation Index (CAI). CAI is the geometric mean of the relative usage of codons in a gene, and is calculated using a codon usage table trained on a set of highly expressed genes. In contrast, rCAI is computed by subtracting the background codon usage, trained on the two noncoding frames of highly expressed genes, from the codon usage in the coding frame. rCAI has a higher signal-to-noise ratio than CAI, given that noncoding frames would not show codon bias. Translation efficiency and protein abundance correlate comparably or better with rCAI than with CAI or other measures such as 'effective number of codons' and 'SCUMBLE offsets'. Within overlapping coding regions, one of the two coding frames dominates in codon usage bias according to rCAI. Presumably, rCAI could substitute for CAI in diverse applications.
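The CAI definition above (a geometric mean of relative codon usage) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the reference counts and the two synonymous families are made-up toy values, the helper names are hypothetical, and codons with zero weight are simply skipped as a simplifying assumption. rCAI is not sketched because the abstract does not give its exact formula.

```python
import math


def relative_adaptiveness(ref_counts, synonymous_families):
    """Weight each codon by its count relative to the most used synonym."""
    weights = {}
    for family in synonymous_families:
        top = max(ref_counts.get(codon, 0) for codon in family)
        if top == 0:
            continue  # family never observed in the reference set
        for codon in family:
            weights[codon] = ref_counts.get(codon, 0) / top
    return weights


def cai(cds, weights):
    """Geometric mean of codon weights over a coding sequence."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    logs = [math.log(weights[c]) for c in codons if weights.get(c, 0) > 0]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy reference table (hypothetical counts from "highly expressed genes").
families = [("AAA", "AAG"), ("GAA", "GAG")]  # Lys and Glu codons
counts = {"AAA": 10, "AAG": 40, "GAA": 30, "GAG": 10}
w = relative_adaptiveness(counts, families)

preferred_only = cai("AAGGAA", w)  # both codons are the preferred synonym
mixed = cai("AAAGAA", w)           # one rare Lys codon, one preferred Glu codon
```

In this toy table, a sequence built only from preferred codons scores 1, and each rare codon pulls the geometric mean down.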
Project description:Synthetic gene networks can be used to control gene expression and cellular phenotypes in a variety of applications. In many instances, however, such networks can behave unreliably due to gene expression noise. Accordingly, there is a need to develop systematic means to tune gene expression noise, so that it can be suppressed in some cases and harnessed in others, e.g. in cellular differentiation to create population-wide heterogeneity. Here, we present a method for controlling noise in synthetic eukaryotic gene expression systems, utilizing reduction of noise levels by TATA box mutations and noise propagation in transcriptional cascades. Specifically, we introduce TATA box mutations into promoters driving TetR expression and show that these mutations can be used to effectively tune the noise of a target gene while decoupling it from the mean, with negligible effects on the dynamic range and basal expression. We apply mathematical and computational modeling to explain the experimentally observed effects of TATA box mutations. This work, which highlights some important aspects of noise propagation in gene regulatory cascades, has practical implications for implementing gene expression control in synthetic gene networks.
Project description:BACKGROUND: Natural selection has traditionally been understood as a force responsible for pushing genes to states of higher translational efficiency, whereas lower translational efficiency has been explained by neutral mutation and genetic drift. We looked for evidence of directional selection resulting in increased unpreferred codon usage (and presumably reduced translational efficiency) in three divergent clusters of eukaryotic genomes using a simple optimal-codon-based metric (Kp/Ku). RESULTS: Here we show that for some genes natural selection is indeed responsible for causing accelerated unpreferred codon substitution, and document the scope of this selection. In Cryptococcus and to a lesser extent Drosophila, we find many genes showing a statistically significant signal of selection for unpreferred codon usage in one or more lineages. We did not find evidence for this type of selection in Saccharomyces. The signal of positive selection observed from unpreferred synonymous codon substitutions is coincident in Cryptococcus and Drosophila with the distribution of upstream open reading frames (uORFs), another genic feature known to reduce translational efficiency. Functional enrichment analysis of genes exhibiting low Kp/Ku ratios reveals that genes in regulatory roles are particularly subject to this type of selection. CONCLUSION: Through genome-wide scans, we find recent selection for unpreferred codon usage at approximately 1% of genetic loci in Cryptococcus and at several genes in Drosophila. Unpreferred codons can impede translation efficiency, and we find that genes with translation-impeding uORFs are enriched for this selection signal. We find that regulatory genes are particularly likely to be subject to selection for unpreferred codon usage. Given that expression noise can propagate through regulatory cascades, and that low translational efficiency can reduce expression noise, this finding supports the hypothesis that translational efficiency may be suppressed in some cases to reduce stochastic noise in gene expression.
Project description:A key goal in molecular evolution is to extract mechanistic insights from signatures of selection. A case study is codon usage, where despite many recent advances and hypotheses, two longstanding problems remain: the relative contribution of selection and mutation in determining codon frequencies and the relative contribution of translational speed and accuracy to selection. The relevant targets of selection--the rate of translation and of mistranslation of a codon per unit time in the cell--can only be related to mechanistic properties of the translational apparatus if the number of transcripts per cell is known, requiring use of gene expression measurements. Perhaps surprisingly, different gene-expression data sets yield markedly different estimates of selection. We show that this is largely due to measurement noise, notably due to differences between studies rather than instrument error or biological variability. We develop an analytical framework that explicitly models noise in expression in the context of the population-genetic model. Estimates of mutation and selection strength in budding yeast produced by this method are robust to the expression data set used and are substantially higher than estimates using a noise-blind approach. We introduce per-gene selection estimates that correlate well with previous scoring systems, such as the codon adaptation index, while now carrying an evolutionary interpretation. On average, selection for codon usage in budding yeast is weak, yet our estimates show that genes range from virtually unselected to average per-codon selection coefficients above the inverse population size. Our analytical framework may be generally useful for distinguishing biological signals from measurement noise in other applications that depend upon measurements of gene expression.
Project description:Codon usage bias is a universal feature of eukaryotic and prokaryotic genomes and has been proposed to regulate translation efficiency, accuracy, and protein folding based on the assumption that codon usage affects translation dynamics. The roles of codon usage in translation, however, are not clear and have been challenged by recent ribosome profiling studies. Here we used a Neurospora cell-free translation system to directly monitor the velocity of mRNA translation. We demonstrated that the preferred codons enhance the rate of translation elongation, whereas non-optimal codons slow elongation. Codon usage also controls ribosome traffic on mRNA. These conclusions were supported by ribosome profiling results in vitro and in vivo with template mRNAs designed to increase the signal-to-noise ratio. Finally, we demonstrate that codon usage regulates protein function by affecting co-translational protein folding. These results resolve a long-standing fundamental question and suggest the existence of a codon usage code for protein folding.
Project description:BACKGROUND: Synonymous codon usage bias has typically been correlated with, and attributed to, translational efficiency. However, other pressures on genomic sequence composition, such as mutational biases, can also affect codon usage patterns. This study provides an analysis of the codon usage patterns in Arabidopsis thaliana in relation to gene expression levels, codon volatility, mutational biases and selective pressures. RESULTS: We have performed synonymous codon usage and codon volatility analyses for all genes in the A. thaliana genome. In contrast to reports for species from other kingdoms, we find that neither codon usage nor volatility is correlated with selection pressure (as measured by dN/dS) or with gene expression levels at a genome-wide level. Our results show that codon volatility and usage are not synonymous; rather, both are correlated with the abundance of G and C at the third codon position (GC3). CONCLUSIONS: Our results indicate that while the A. thaliana genome shows evidence for synonymous codon usage bias, this is not related to the expression levels of its constituent genes. Neither codon volatility nor codon usage is correlated with expression levels or selective pressures but, because they are directly related to the composition of G and C at the third codon position, they are the result of mutational bias. Therefore, in A. thaliana codon volatility and usage do not result from selection for translation efficiency or protein functional shift as measured by positive selection.
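GC3, the quantity the abstract above correlates with codon usage and volatility, is the fraction of G and C among third codon positions. A minimal sketch follows, assuming an in-frame coding sequence given as a plain string; the function name `gc3` and the example sequence are illustrative and not from the study.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame CDS."""
    # Indices 2, 5, 8, ... are the third positions of complete codons;
    # a trailing partial codon contributes nothing.
    thirds = cds.upper()[2::3]
    if not thirds:
        return float("nan")
    return sum(base in "GC" for base in thirds) / len(thirds)


# Codons ATG, GCC, AAA have third positions G, C, A.
example = gc3("ATGGCCAAA")
```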
Project description:Codon usage biases are found in all eukaryotic and prokaryotic genomes, and preferred codons are more frequently used in highly expressed genes. The effects of codon usage on gene expression were previously thought to be mainly mediated by its impacts on translation. Here, we show that codon usage strongly correlates with both protein and mRNA levels genome-wide in the filamentous fungus Neurospora. Gene codon optimization also results in strong up-regulation of protein and RNA levels, suggesting that codon usage is an important determinant of gene expression. Surprisingly, we found that the impact of codon usage on gene expression results mainly from effects on transcription and is largely independent of mRNA translation and mRNA stability. Furthermore, we show that histone H3 lysine 9 trimethylation is one of the mechanisms responsible for the codon usage-mediated transcriptional silencing of some genes with nonoptimal codons. Together, these results uncover an unexpectedly important role of codon usage in ORF sequences in determining transcription levels and suggest that codon biases are an adaptation of protein coding sequences to both transcription and translation machineries. Therefore, synonymous codons not only specify protein sequences and translation dynamics, but also help determine gene expression levels.
Project description:With respect to upstream regions of microRNA (miRNA) target sites located in protein coding sequences, experimental studies have suggested that rare codons, rather than frequent codons, are important for miRNA function because they slow down the local translational process. However, whether there is a trend of reduced translation efficiency near miRNA targets is still unknown. Using Arabidopsis thaliana, we perform a genome-wide analysis of synonymous codon usage in upstream regions of miRNA target sites. At the whole-genome level, we find no significant selection signals for decreased translational efficiency. However, the same genome analyses do show substantial variation in translation efficiency reduction among miRNA targets. We find that miRNA conservation level, gene codon usage bias, and the mechanism of miRNA action can account for the differences in translation efficiency. In contrast, a gene's GC content, its expression level, and the conservation level of the miRNA target have no effect on the local translation efficiency of miRNA targets. Although local translation efficiency in the upstream region of miRNA targets is related to miRNA function in A. thaliana, the selection signal of rare codon usage in that region is weak. We propose that other biological factors are more important than local translation efficiency in miRNA action when miRNA targets are located in protein coding sequences.
Project description:The glucokinase gene is 15.5-kilobases long, appears to be present as a single copy, and contains 10 exons that range in size from 96 to 977 base pairs. The transcription start site was located 127 nucleotides upstream from the translation initiation codon. The 5' flanking DNA contains several regions similar to defined promoter elements. These include a probable "TATA box," an Sp1 binding site, and several elements related to liver-specific gene expression. In addition, we determined that transcription of the glucokinase gene increased at least 20-fold when diabetic rats were treated with insulin for 2 hr.