ABSTRACT: Transcription factors (TFs) are able to associate with their binding sites on DNA faster than the physical limit posed by diffusion. Such high association rates can be achieved by alternating between three-dimensional diffusion and one-dimensional sliding along the DNA chain, a mechanism dubbed facilitated diffusion. By studying a collection of TF binding sites of Escherichia coli from the RegulonDB database and of Bacillus subtilis from DBTBS, we reveal a funnel in the binding energy landscape around the target sequences. We show that such a funnel is linked to the presence of gradients of AT in the base composition of the DNA region around the binding sites. An extensive computational study of the stochastic sliding process along the energetic landscapes obtained from the database shows that the funnel can significantly enhance the probability that TFs find their target sequences when sliding in their proximity. We demonstrate that this enhancement leads to a speed-up of the association process.
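The effect of a funnel on target finding can be illustrated with a toy lattice model. The sketch below is not the authors' simulation: it assumes a hypothetical linear funnel (the names `funnel_landscape`, `depth`, `width` and the per-step dissociation probability `p_off` are illustrative choices), Metropolis hopping between neighbouring sites, and energies in units of k_BT. It solves exactly for the probability that a sliding TF reaches the target at site 0 before dissociating.

```python
import numpy as np

def hit_probability(energy, start, p_off=0.01):
    """Probability of reaching site 0 (absorbing target) before dissociating.

    energy : site energies in units of k_BT (assumed toy landscape)
    start  : index of the site where the TF lands on the DNA
    p_off  : assumed probability of dissociating at each step
    """
    energy = np.asarray(energy, dtype=float)
    n = len(energy)
    # Linear system A h = b, where h[i] = P(hit target | currently at site i).
    A = np.eye(n)
    b = np.zeros(n)
    b[0] = 1.0  # the target itself: h[0] = 1
    for i in range(1, n):
        stay = 0.0
        for j in (i - 1, i + 1):
            if j >= n:
                stay += 0.5  # reflecting boundary at the far end
                continue
            # Metropolis acceptance for an attempted hop i -> j
            accept = min(1.0, np.exp(-(energy[j] - energy[i])))
            A[i, j] -= (1.0 - p_off) * 0.5 * accept
            stay += 0.5 * (1.0 - accept)  # rejected hop: TF stays put
        A[i, i] -= (1.0 - p_off) * stay
    return np.linalg.solve(A, b)[start]

def funnel_landscape(n_sites, depth, width):
    """Hypothetical linear funnel: energy -depth at site 0, rising to 0 at `width`."""
    i = np.arange(n_sites)
    return -depth * np.clip(1.0 - i / width, 0.0, None)

flat = np.zeros(101)
funnel = funnel_landscape(101, depth=3.0, width=30)
p_flat = hit_probability(flat, start=20)
p_funnel = hit_probability(funnel, start=20)
print(f"flat landscape:   P(hit) = {p_flat:.3f}")
print(f"funnel landscape: P(hit) = {p_funnel:.3f}")
```

In this toy setting the funnel biases hops toward the target, so the hit probability from a nearby landing site is higher than on a flat landscape. The size of the gain depends on the assumed depth, width and dissociation probability.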
Project description: We introduce macromolecular crowding quantitatively into the model for kinetics of gene regulation in Escherichia coli. We analyse and compute the specific-site searching time for 180 known transcription factors (TFs) regulating 1300 operons. The time is between 160 s (e.g. for SoxS, Mw = 12.91 kDa) and 1550 s (e.g. for PepA6, Mw = 329.28 kDa). Diffusion coefficients for one-dimensional sliding are between for large proteins up to for small monomers or dimers. Three-dimensional diffusion coefficients in the cytoplasm are 2 orders of magnitude larger than 1D sliding coefficients; nevertheless, sliding enhances the binding rates of TFs to specific sites by 1-2 orders of magnitude. The latter effect is due to ubiquitous non-specific binding. We compare the model to experimental data for LacI repressor and find that non-specific binding of the protein to DNA is activation- and not diffusion-limited. We show that the target location rate by LacI repressor is optimized with respect to the microscopic rate constant for association to non-specific sites on DNA. We analyse the effect of oligomerization of TFs and DNA looping effects on searching kinetics. We show that the optimal searching strategy depends on TF abundance.
Project description: We show that nucleosomes exert a maximal amount of hindrance to the one-dimensional diffusion of transcription factors (TFs) when they are present between TFs and their cognate sites on DNA. The effective one-dimensional diffusion coefficient of TFs (χ_TF) decreases with a rise in the free-energy barrier (μ_NU) of the sliding of nucleosomes as χ_TF ∝ exp(−μ_NU). The average time (η_L) required by TFs to slide over L sites on DNA increases with μ_NU as η_L ∝ exp(μ_NU). When TFs move close to nucleosomes, they exhibit typical subdiffusion. Nucleosomes can enhance the search dynamics of TFs when TFs are present between nucleosomes and TF binding sites. These results suggest that nucleosome-depleted regions around the cognate sites of TFs are mandatory for efficient site-specific binding of TFs. Remarkably, the genome-wide in vivo positioning pattern of TFs shows a maximum at their specific binding sites where the occupancy of nucleosomes shows a minimum. This could be a consequence of an increasing level of breathing dynamics of nucleosome cores and decreasing levels of fluctuations in the DNA binding domains of TFs as they move across TF binding sites. The dynamics of TFs becomes slow as they approach their cognate sites so that TFs form a tight site-specific complex, whereas the dynamics of nucleosomes becomes rapid so that they quickly pass through the cognate sites of TFs. Several in vivo data sets on the genome-wide positioning pattern of nucleosomes and TFs agree well with our arguments. The retarding effects of nucleosomes can be minimized when the degree of condensation of DNA is such that it can permit a jump size associated with the dynamics of TFs beyond ∼160-180 bp.
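The two exponential scalings quoted in this description can be made concrete with a short calculation. This is only a sketch of the stated proportionalities: it assumes the barrier is expressed in units of k_BT and reports values relative to a zero-barrier reference, since the prefactors are not given here. The function names are illustrative.

```python
import math

def relative_diffusion(mu_nu):
    """TF diffusion coefficient relative to the zero-barrier case, exp(-mu_NU)."""
    return math.exp(-mu_nu)

def relative_sliding_time(mu_nu):
    """Time to slide over L sites relative to the zero-barrier case, exp(mu_NU)."""
    return math.exp(mu_nu)

# Assumed example barriers, in units of k_BT
for mu in (0.0, 1.0, 2.0, 4.0):
    print(f"mu_NU = {mu:.1f}: diffusion x{relative_diffusion(mu):.3f}, "
          f"sliding time x{relative_sliding_time(mu):.1f}")
```

Under these scalings, each additional k_BT of nucleosome sliding barrier lowers the effective diffusion coefficient and lengthens the sliding time by the same factor of e.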
Project description: Proper timing of gene expression requires that transcription factors (TFs) efficiently locate and bind their target sites within a genome. Theoretical studies have long proposed that one-dimensional sliding along DNA while simultaneously reading its sequence can accelerate TFs' location of target sites. Sliding by prokaryotic and eukaryotic TFs was subsequently observed. More recent theoretical investigations have argued that simultaneous reading and sliding is not possible for TFs unless they possess at least two DNA-binding modes. The tumor suppressor p53 has been shown to slide on DNA, and recent experiments have offered structural and single-molecule support for a two-mode model for the protein. If the model is applicable to p53, then the requirement that TFs be able to read while sliding implies that noncognate sites will affect p53's mobility on DNA, which will thus be generally sequence-dependent. Here, we confirm this prediction with single-molecule microscopy measurements of p53's local diffusivity on noncognate DNA. We show how a two-mode model accurately predicts the variation in local diffusivity, while a single-mode model does not. We further determine that the best model of sequence-specific binding energy includes terms for "hemi-specific" binding, with one dimer of tetrameric p53 binding specifically to a half-site and the other binding nonspecifically to noncognate DNA. Our work provides evidence that the recognition by p53 of its targets and the timing thereof can depend on its noncognate binding properties and its ability to change between multiple modes of binding, in addition to the much better-studied effects of cognate-site binding.
Project description: It is known that DNA-binding proteins can slide along the DNA helix while searching for specific binding sites, but their path of motion remains obscure. Do these proteins undergo simple one-dimensional (1D) translational diffusion, or do they rotate to maintain a specific orientation with respect to the DNA helix? We measured 1D diffusion constants as a function of protein size while maintaining the DNA-protein interface. Using bootstrap analysis of single-molecule diffusion data, we compared the results to theoretical predictions for pure translational motion and rotation-coupled sliding along the DNA. The data indicate that DNA-binding proteins undergo rotation-coupled sliding along the DNA helix and can be described by a model of diffusion along the DNA helix on a rugged free-energy landscape. A similar analysis including the 1D diffusion constants of eight proteins of varying size shows that rotation-coupled sliding is a general phenomenon. The average free-energy barrier for sliding along the DNA was 1.1 ± 0.2 k_BT. Such small barriers facilitate rapid search for binding sites.
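The size dependence that distinguishes the two sliding pictures can be sketched with simple hydrodynamic estimates. The description does not state which theoretical expressions were used, so the code below assumes a Stokes sphere for pure translation and Schurr's estimate for rotation-coupled sliding, with the protein treated as a sphere centred on the DNA axis and a helical pitch of 3.4 nm. The constants and function names are illustrative, and no free-energy roughness is included.

```python
import math

KT = 4.11e-21     # thermal energy in J (assumed ~298 K)
ETA = 0.89e-3     # viscosity of water in Pa*s (assumed ~25 C)
PITCH = 3.4e-9    # B-DNA helical pitch in m (one full turn per 3.4 nm)

def d_translation(radius):
    """Stokes-Einstein estimate for pure 1D translation of a sphere (m^2/s)."""
    return KT / (6.0 * math.pi * ETA * radius)

def d_rotation_coupled(radius):
    """Schurr-type estimate: translation plus one rotation per helical turn (m^2/s)."""
    translational_friction = 6.0 * math.pi * ETA * radius
    rotational_friction = 8.0 * math.pi * ETA * radius ** 3
    coupling = (2.0 * math.pi / PITCH) ** 2  # converts rotation to axial motion
    return KT / (translational_friction + coupling * rotational_friction)

# Assumed example radii in metres
for r_nm in (2.0, 3.0, 4.0, 6.0):
    r = r_nm * 1e-9
    print(f"R = {r_nm:.0f} nm: translation {d_translation(r):.2e} m^2/s, "
          f"rotation-coupled {d_rotation_coupled(r):.2e} m^2/s")
```

In this estimate pure translation scales as 1/R, whereas rotation-coupled sliding falls off close to 1/R^3 once the rotational term dominates, which is why measuring diffusion constants as a function of protein size can discriminate between the two.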
Project description: Enhancer-binding pluripotency regulators (Sox2 and Oct4) play a seminal role in embryonic stem (ES) cell-specific gene regulation. Here, we combine in vivo and in vitro single-molecule imaging, transcription factor (TF) mutagenesis, and ChIP-exo mapping to determine how TFs dynamically search for and assemble on their cognate DNA target sites. We find that enhanceosome assembly is hierarchically ordered with kinetically favored Sox2 engaging the target DNA first, followed by assisted binding of Oct4. Sox2/Oct4 follow a trial-and-error sampling mechanism involving 84-97 events of 3D diffusion (3.3-3.7 s) interspersed with brief nonspecific collisions (0.75-0.9 s) before acquiring and dwelling at specific target DNA (12.0-14.6 s). Sox2 employs a 3D diffusion-dominated search mode facilitated by 1D sliding along open DNA to efficiently locate targets. Our findings also reveal fundamental aspects of gene and developmental regulation by fine-tuning TF dynamics and the influence of the epigenome on target search parameters.
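The trial-and-error numbers quoted in this description imply a rough total search time per target. The sketch below assumes each trial consists of one 3D diffusion episode followed by one nonspecific collision, and it pairs the lower bounds together and the upper bounds together. That pairing is an assumption made for illustration, not something stated in the description, and `search_time` is an illustrative helper.

```python
def search_time(n_trials, t_3d, t_nonspecific):
    """Total time (s) spent searching before specific binding.

    n_trials      : number of 3D-diffusion / nonspecific-collision cycles
    t_3d          : duration of one 3D diffusion episode (s)
    t_nonspecific : duration of one nonspecific collision (s)
    """
    return n_trials * (t_3d + t_nonspecific)

low = search_time(84, 3.3, 0.75)   # lower bounds of the quoted ranges
high = search_time(97, 3.7, 0.9)   # upper bounds of the quoted ranges
print(f"estimated search time: {low:.1f} s to {high:.1f} s")
```

With these inputs the estimate is about 340 s to 446 s, roughly 6 to 7 minutes of searching, compared with the quoted 12.0-14.6 s dwell at the specific target.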
Project description: DNA binding proteins efficiently search for their cognate sites on long genomic DNA by combining 3D diffusion and 1D diffusion (sliding) along the DNA. Recent experimental results and theoretical analyses revealed that the proteins show a rotation-coupled sliding along the DNA helical pitch. Here, we performed Brownian dynamics simulations using newly developed coarse-grained (CG) protein and DNA models to evaluate how hydrodynamic interactions between the protein and DNA molecules, the binding affinity of the protein to DNA, and DNA fluctuations affect the one-dimensional diffusion of the protein on the DNA. Our results indicate that intermolecular hydrodynamic interactions reduce 1D diffusivity by 30%. On the other hand, structural fluctuations of DNA give rise to steric collisions between the CG-proteins and DNA, resulting in faster 1D sliding of the protein. Proteins with low binding affinities consistent with experimental estimates of non-specific DNA binding show hopping along the CG-DNA. This hopping significantly increases sliding speed. These simulation studies provide additional insights into the mechanism of how DNA binding proteins find their target sites on the genome.
Project description: A detailed description of the events ruling ligand/protein interaction and an accurate estimation of the drug affinity to its target are of great help in speeding up drug discovery strategies. We have developed a metadynamics-based approach, named funnel metadynamics, that enhances the ligand's sampling of the target binding sites and of its solvated states. This method leads to an efficient characterization of the binding free-energy surface and an accurate calculation of the absolute protein-ligand binding free energy. We illustrate our protocol in two systems, benzamidine/trypsin and SC-558/cyclooxygenase 2. In both cases, the X-ray conformation has been found as the lowest free-energy pose, and the computed protein-ligand binding free energy is in good agreement with experiments. Furthermore, funnel metadynamics unveils important information about the binding process, such as the presence of alternative binding modes and the role of waters. The results achieved at an affordable computational cost make funnel metadynamics a valuable method for drug discovery and for dealing with a variety of problems in chemistry, physics, and material science.
Project description: Chromatin endogenous cleavage (ChEC) uses fusion of a protein of interest to micrococcal nuclease (MNase) to target calcium-dependent cleavage to specific genomic loci in vivo. Here we report the combination of ChEC with high-throughput sequencing (ChEC-seq) to map budding yeast transcription factor (TF) binding. Temporal analysis of ChEC-seq data reveals two classes of sites for TFs, one displaying rapid cleavage at sites with robust consensus motifs and the second showing slow cleavage at largely unique sites with low-scoring motifs. Sites with high-scoring motifs also display asymmetric cleavage, indicating that ChEC-seq provides information on the directionality of TF-DNA interactions. Strikingly, similar DNA shape patterns are observed regardless of motif strength, indicating that the kinetics of ChEC-seq discriminates DNA recognition through sequence and/or shape. We propose that time-resolved ChEC-seq detects both high-affinity interactions of TFs with consensus motifs and sites preferentially sampled by TFs during diffusion and sliding. Overall design: We adapted ChEC to a genome-wide sequencing readout (ChEC-seq) to map the genome-wide distributions of the budding yeast transcription factors Abf1, Rap1 and Reb1 without the limitations associated with ChIP-based methods.
Project description: An important step in understanding gene regulation is to identify the DNA binding sites recognized by each transcription factor (TF). Conventional approaches to prediction of TF binding sites involve the definition of consensus sequences or position-specific weight matrices and rely on statistical analysis of DNA sequences of known binding sites. Here, we present a method called SiteSleuth in which DNA structure prediction, computational chemistry, and machine learning are applied to develop models for TF binding sites. In this approach, binary classifiers are trained to discriminate between true and false binding sites based on the sequence-specific chemical and structural features of DNA. These features are determined via molecular dynamics calculations in which we consider each base in different local neighborhoods. For each of 54 TFs in Escherichia coli, for which at least five DNA binding sites are documented in RegulonDB, the TF binding sites and portions of the non-coding genome sequence are mapped to feature vectors and used in training. According to cross-validation analysis and a comparison of computational predictions against ChIP-chip data available for the TF Fis, SiteSleuth outperforms three conventional approaches: Match, MATRIX SEARCH, and the method of Berg and von Hippel. SiteSleuth also outperforms QPMEME, a method similar to SiteSleuth in that it involves a learning algorithm. The main advantage of SiteSleuth is a lower false positive rate.