DNase I-hypersensitive exons colocalize with promoters and distal regulatory elements.
ABSTRACT: The precise splicing of genes confers an enormous transcriptional complexity to the human genome. The majority of gene splicing occurs cotranscriptionally, permitting epigenetic modifications to affect splicing outcomes. Here we show that select exonic regions are demarcated within the three-dimensional structure of the human genome. We identify a subset of exons that exhibit DNase I hypersensitivity and are accompanied by 'phantom' signals in chromatin immunoprecipitation and sequencing (ChIP-seq) that result from cross-linking with proximal promoter- or enhancer-bound factors. The capture of structural features by ChIP-seq is confirmed by chromatin interaction analysis that resolves local intragenic loops that fold exons close to cognate promoters while excluding intervening intronic sequences. These interactions of exons with promoters and enhancers are enriched for alternative splicing events, an effect reflected in cell type-specific periexonic DNase I hypersensitivity patterns. Collectively, our results connect local genome topography, chromatin structure and cis-regulatory landscapes with the generation of human transcriptional complexity by cotranscriptional splicing.
Project description: We describe protein interaction quantitation (PIQ), a computational method for modeling the magnitude and shape of genome-wide DNase I hypersensitivity profiles to identify transcription factor (TF) binding sites. Through the use of machine-learning techniques, PIQ identified binding sites for >700 TFs from one DNase I hypersensitivity analysis followed by sequencing (DNase-seq) experiment with accuracy comparable to that of chromatin immunoprecipitation followed by sequencing (ChIP-seq). We applied PIQ to analyze DNase-seq data from mouse embryonic stem cells differentiating into prepancreatic and intestinal endoderm. We identified 120 and experimentally validated eight 'pioneer' TF families that dynamically open chromatin. Four pioneer TF families only opened chromatin in one direction from their motifs. Furthermore, we identified 'settler' TFs whose genomic binding is principally governed by proximity to open chromatin. Our results support a model of hierarchical TF binding in which directional and nondirectional pioneer activity shapes the chromatin landscape for population by settler TFs.
Project description: Background: Next-generation sequencing coupled to chromatin immunoprecipitation (ChIP-seq), DNase I hypersensitivity (DNase-seq) and the transposase-accessible chromatin assay (ATAC-seq) has generated enormous amounts of data and markedly improved our understanding of the transcriptional and epigenetic control of gene expression. To take advantage of such datasets and provide clues on which factors, including transcription factors, epigenetic regulators and histone modifications, potentially regulate the expression of a gene of interest, a tool is needed for simultaneous queries of multiple datasets using gene symbols or genomic coordinates as search terms. Results: In this study, we annotated the peaks of thousands of ChIP-seq datasets generated by the ENCODE project, and of ChIP-seq/DNase-seq/ATAC-seq datasets deposited in Gene Expression Omnibus (GEO) and curated by the Cistrome project. We built a MySQL database called TFmapper containing the annotations and associated metadata, allowing users without bioinformatics expertise to search across thousands of datasets through a web interface and identify factors targeting a genomic region or gene of interest in a specified sample. Users can also visualize multiple peaks in genome browsers and download the corresponding sequences. Conclusion: TFmapper will help users explore the vast amount of publicly available ChIP-seq/DNase-seq/ATAC-seq data and perform integrative analyses to understand the regulation of a gene of interest. The web server is freely accessible at http://www.tfmapper.org/.
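The kind of coordinate-based lookup described above can be sketched as follows. This is a minimal illustration, not TFmapper's actual schema or code: the table name `peaks`, its columns, the toy rows, and the helper `factors_in_region` are all hypothetical, and Python's built-in `sqlite3` stands in for the MySQL backend named in the abstract. Intervals are assumed to be half-open, so two intervals overlap when each starts before the other ends.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical table of annotated peaks."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE peaks (
            factor     TEXT,     -- factor or histone mark profiled
            sample     TEXT,     -- cell line or tissue
            chrom      TEXT,
            peak_start INTEGER,  -- half-open interval [peak_start, peak_end)
            peak_end   INTEGER
        )
        """
    )
    # Toy rows for illustration only; these are not real peak calls.
    rows = [
        ("CTCF", "K562", "chr1", 1000, 1200),
        ("GATA1", "K562", "chr1", 1500, 1700),
        ("CTCF", "HepG2", "chr1", 1100, 1300),
        ("MYC", "K562", "chr2", 1000, 1200),
    ]
    conn.executemany("INSERT INTO peaks VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def factors_in_region(conn, chrom, start, end, sample):
    """Return the sorted factors with a peak overlapping [start, end) in one sample."""
    sql = """
        SELECT DISTINCT factor
        FROM peaks
        WHERE chrom = ? AND sample = ?
          AND peak_start < ?   -- peak begins before the query ends
          AND peak_end > ?     -- peak ends after the query begins
        ORDER BY factor
    """
    return [row[0] for row in conn.execute(sql, (chrom, sample, end, start))]


if __name__ == "__main__":
    conn = build_demo_db()
    print(factors_in_region(conn, "chr1", 1150, 1600, "K562"))
```

A gene-symbol search would reduce to the same query once the symbol has been translated into coordinates by an annotation table, which is omitted here.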
Project description: BACKGROUND: Regulation of transcription depends on interactions between cis-regulatory elements (CREs) and regulatory proteins. Active CREs are embedded in open chromatin that is accessible to nucleases. Several techniques, including DNase-seq, which is based on the nuclease DNase I, and ATAC-seq, which is based on the transposase Tn5, have been widely used to identify genomic regions associated with open chromatin. These techniques have played a key role in dissecting the regulatory networks of gene expression in both animal and plant species. RESULTS: We develop a technique, named MNase hypersensitivity sequencing (MH-seq), to identify genomic regions associated with open chromatin in Arabidopsis thaliana. Genomic regions enriched with MH-seq reads are referred to as MNase hypersensitive sites (MHSs). MHSs overlap with the majority (~90%) of the open chromatin identified previously by DNase-seq and ATAC-seq. Surprisingly, 22% of MHSs are not covered by DNase-seq or ATAC-seq reads; these are referred to as "specific MHSs" (sMHSs). sMHSs tend to be located away from promoters, and a substantial portion of sMHSs are derived from transposable elements. Most interestingly, genomic regions containing sMHSs are enriched with epigenetic marks, including H3K27me3 and DNA methylation. In addition, sMHSs show a number of distinct characteristics, including association with transcriptional repressors. Thus, sMHSs span distinct classes of open chromatin that may not be accessible to DNase I or Tn5. We hypothesize that the small size of the MNase enzyme relative to DNase I or Tn5 allows it to access relatively more condensed chromatin domains. CONCLUSION: MNase can be used to identify open chromatin regions that are not accessible to DNase I or Tn5. Thus, MH-seq provides an important tool to identify and catalog all classes of open chromatin in plants.
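The definition of sMHSs above is essentially a set difference on genomic intervals, which can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the abstract defines sMHSs by the absence of DNase-seq or ATAC-seq reads, whereas this sketch approximates that with called peak intervals, and the function names and toy coordinates are hypothetical. Intervals are `(chrom, start, end)` tuples treated as half-open.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def specific_sites(mhs, reference):
    """Return the MHS intervals that overlap none of the reference intervals.

    `reference` stands in for the union of DNase-seq and ATAC-seq regions.
    The quadratic scan is kept for clarity; a sorted sweep or interval tree
    would be preferable for genome-scale inputs.
    """
    return [site for site in mhs if not any(overlaps(site, ref) for ref in reference)]


if __name__ == "__main__":
    # Toy coordinates for illustration only.
    mhs = [("Chr1", 100, 200), ("Chr1", 500, 600), ("Chr2", 100, 200)]
    dnase_atac = [("Chr1", 150, 300), ("Chr2", 300, 400)]
    smhs = specific_sites(mhs, dnase_atac)
    print(smhs, len(smhs) / len(mhs))
```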
Project description: Much remains unknown concerning the mechanism by which the splicing machinery pinpoints short exons within intronic sequences and how splicing factors are directed to their pre-mRNA targets. Part of the explanation probably lies in differences in chromatin organization between exons and introns. Proteomic, co-immunoprecipitation, and sedimentation analyses described here indicated that SF3B1, an essential splicing component of the U2 snRNP complex, is strongly associated with nucleosomes. ChIP-seq and RNA-seq analyses revealed that SF3B1 is specifically bound to nucleosomes located at exonic positions. SF3B1 binding is enriched at nucleosomes positioned over short exons flanked by long introns that are also characterized by differential GC content between exons and introns. Disruption of SF3B1 binding to such nucleosomes affected the splicing of these exons similarly to inhibition of SF3B1 expression. Our findings suggest that the association of SF3B1 with nucleosomes is functionally important for splice site recognition and that SF3B1 conveys splicing-relevant information embedded in chromatin structure. The dataset comprises MNase-seq on input and SF3B1 pull-down samples, and mRNA-seq on control and SF3B1 siRNA-treated cells as well as on trichostatin A (TSA)-treated and untreated cells.
Project description: Recent ChIP experiments indicate that spliceosome assembly and splicing can occur cotranscriptionally in S. cerevisiae. However, only a few genes have been examined, and all have long second exons. To extend these studies, we analyzed intron-containing genes with different second exon lengths by using ChIP as well as whole-genome tiling arrays (ChIP-chip). The data indicate that U1 snRNP recruitment is independent of exon length. Recursive splicing constructs, which uncouple U1 recruitment from transcription, suggest that cotranscriptional U1 recruitment contributes to optimal splicing efficiency. In contrast, U2 snRNP recruitment, as well as cotranscriptional splicing, is deficient on short second exon genes. We estimate that ≥90% of endogenous yeast splicing is posttranscriptional, consistent with an analysis of posttranscriptional snRNP-associated pre-mRNA.
Project description: Recent studies have provided strong evidence for a regulatory link among chromatin structure, histone modification, and splicing regulation. However, it is largely unknown how local histone modification patterns surrounding alternative exons are connected to differential alternative splicing outcomes. Here we show that splicing regulator Hu proteins can induce local histone hyperacetylation by association with their target sequences on the pre-mRNA surrounding alternative exons of two different genes. In both primary and mouse embryonic stem cell-derived neurons, histone hyperacetylation leads to an increased local transcriptional elongation rate and decreased inclusion of these exons. Furthermore, we demonstrate that Hu proteins interact with histone deacetylase 2 and inhibit its deacetylation activity. We propose that splicing regulators may actively modulate chromatin structure when recruited to their target RNA sequences cotranscriptionally. This "reaching back" interaction with chromatin provides a means to ensure accurate and efficient regulation of alternative splicing.
Project description: Gene regulation during the process of osteoblastogenesis has been well described, yet the discovery of novel regulatory regions has been limited by how we currently predict the locations of functional cis-regulatory modules. Historically, the de novo identification of sequences critical for the control of gene expression relied primarily on sequence conservation in promoters. Queries for binding motifs were based on position weight matrices of known transcription factors and the identification of disease-causing, non-coding mutations near critical genes. However, we now must consider that regulatory elements also rely on 3-dimensional chromosomal interactions between far-distal regions, epigenetic chromosomal modifications, and RNA:DNA interactions. Traditionally, DNase I hypersensitivity assays have been used to identify regulatory regions via preferential digestion of chromatin in which nucleosomes are depleted or displaced as a result of transcription factor occupancy. We probed DNase hypersensitivity on a genome-wide scale to determine whether osteogenic differentiation and/or bone-related gene regulation is marked by the presence of commonly utilized DNA motifs within active cis-regulatory modules. We thus sought to evaluate the gain or loss of motif representation within hypersensitive regions during osteoblastogenesis, from day-0 (growth-phase) to day-28 (mineralizing) MC3T3 cultures. We find that differentiation is marked by an increased enrichment of NFkB-p65, MEF2, and bHLH/E-box motifs within hypersensitive regions, while CTCF, NF1, TEAD, and AP1 motifs decrease. Furthermore, grouping hypersensitive regions based on genomic positioning (promoters, introns, exons, and far-distal regions) reveals significant differences in motif abundance in first introns versus other genomic positions. This finding suggests that the regulation conferred within first intron sequences may be somewhat distinct.
Interestingly, the majority of motifs that were enriched, regardless of genomic position or differentiation time-point, were not completely matched to currently known transcription factor motifs (curated in the JASPAR database). Taken together, the changes in DNase-hypersensitive regions during osteoblastogenesis and the enrichment of distinct motifs within these regions indicate that osteoblasts utilize unique sets of motif rules for transcription factor binding or that regulatory control operates through undiscovered factors. Genome-wide DNase hypersensitivity mapping of osteoblast cultures was performed by adapting the DNase-seq protocol from Song et al. (Song and Crawford, 2010) with slight modifications. Growth-phase (day 0), matrix-deposition stage (day 9), or mineralization stage (day 28) MC3T3-E1 clone-4 cultures were subjected to DNase-seq library preparation. Libraries of purified DNA were generated using custom adapters described in Song et al. High-throughput sequencing was performed on an Illumina Genome Analyzer II with 36-base reads and on an Illumina HiSeq 1000 with 100-base reads. Base calls and sequence reads were generated by Illumina CASAVA software (version 1.6, Illumina). Two independent biological repeats of DNase-seq libraries were prepared for each time point. Each biological repeat is represented by two technical repeats.
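The gain or loss of motif representation between time points can be illustrated with a deliberately simplified sketch. It is not the analysis used in the study, which is not specified in this description: instead of position weight matrices it matches the E-box consensus CANNTG with a regular expression, the sequences are invented toy strings, and the function names are hypothetical. Because CANNTG is its own reverse complement, scanning one strand is sufficient for this particular motif.

```python
import re
from math import log2

# E-box consensus CANNTG, where N is any base.
EBOX = re.compile(r"CA[ACGT]{2}TG")


def fraction_with_motif(seqs, pattern=EBOX):
    """Fraction of sequences containing at least one match to the pattern."""
    if not seqs:
        raise ValueError("need at least one sequence")
    hits = sum(1 for seq in seqs if pattern.search(seq.upper()))
    return hits / len(seqs)


def log2_fold_change(early, late, pattern=EBOX):
    """log2 ratio of motif-containing fractions (late over early).

    No pseudocount is applied, so both fractions must be non-zero.
    """
    return log2(fraction_with_motif(late, pattern) / fraction_with_motif(early, pattern))


if __name__ == "__main__":
    # Invented sequences standing in for hypersensitive regions at two time points.
    day0 = ["ACGTACGT", "TTCAGCTGAA", "GGGGCCCC", "ATATATAT"]
    day28 = ["CACGTGAA", "TTCAGCTGAA", "GGCATATGCC", "ATATATAT"]
    print(fraction_with_motif(day0), fraction_with_motif(day28))
    print(log2_fold_change(day0, day28))
```

A real comparison would also need a background model and a significance test, since region length and base composition alone can change how often a short motif appears by chance.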
Project description: Splicing of mRNA precursors can occur cotranscriptionally and it has been proposed that chromatin structure influences splice site recognition and regulation. Here we have systematically explored potential links between nucleosome positioning and alternative splicing regulation upon progesterone stimulation of breast cancer cells. We confirm preferential nucleosome positioning in exons and report four distinct profiles of nucleosome density around alternatively spliced exons, with RNA polymerase II accumulation closely following nucleosome positioning. Hormone stimulation induces switches between profile classes, correlating with a subset of alternative splicing changes. Hormone-induced exon inclusion often correlates with higher nucleosome occupancy at the exon or the preceding intronic region and with higher RNA polymerase II accumulation. In contrast, exons skipped upon hormone stimulation display low nucleosome densities even before hormone treatment, suggesting that chromatin structure primes alternative splicing regulation. Skipped exons frequently harbor binding sites for hnRNP AB, a hormone-induced splicing regulator whose knockdown prevents some hormone-induced skipping events. Collectively, our results argue that a variety of chromatin architecture mechanisms can influence alternative splicing decisions.
Project description: We have characterized the gene for the mouse glucocorticoid receptor. The gene spans approximately 110 kilobases, and glucocorticoid receptor transcripts are assembled from nine exons. Expression of the gene is controlled by at least three promoters, resulting in glucocorticoid receptor transcripts with different 5' nontranslated exons. One promoter is cell-specific, found to be active only in T lymphocytes. The other two promoters are active to various degrees in all cell lines and tissues so far analyzed and are located in a CpG island. The promoter activities are accompanied by DNase I hypersensitivity sites in chromatin. In contrast to a conservation of exon-intron structure, differences in promoter organization suggest a divergence between the evolution of regulatory and coding regions among members of the steroid receptor superfamily.
Project description: DNA sequence and local chromatin landscape act jointly to determine transcription factor (TF) binding intensity profiles. To disentangle these influences, we developed an experimental approach, called protein/DNA binding followed by high-throughput sequencing (PB-seq), that allows the binding energy landscape to be characterized genome-wide in the absence of chromatin. We applied our methods to the Drosophila Heat Shock Factor (HSF), which inducibly binds a target DNA sequence element (HSE) following heat shock stress. PB-seq involves incubating sheared naked genomic DNA with recombinant HSF, partitioning the HSF-bound and HSF-free DNA, and then detecting HSF-bound DNA by high-throughput sequencing. We compared PB-seq binding profiles with ones observed in vivo by ChIP-seq and developed statistical models to predict the observed departures from idealized binding patterns based on covariates describing the local chromatin environment. We found that DNase I hypersensitivity and tetra-acetylation of H4 were the most influential covariates in predicting changes in HSF binding affinity. We also investigated the extent to which DNA accessibility, as measured by digital DNase I footprinting data, could be predicted from MNase-seq data and the ChIP-chip profiles for many histone modifications and TFs, and found GAGA element associated factor (GAF), tetra-acetylation of H4, and H4K16 acetylation to be the most predictive covariates. Lastly, we generated an unbiased model of HSF binding sequences, which revealed distinct biophysical properties of the HSF/HSE interaction and a previously unrecognized substructure within the HSE. These findings provide new insights into the interplay between the genomic sequence and the chromatin landscape in determining transcription factor binding intensity.