Project description:Transcription is a highly dynamic process. Consequently, we have developed native elongating transcript sequencing technology for mammalian chromatin (mNET-seq), which generates single-nucleotide resolution, nascent transcription profiles. Nascent RNA was detected in the active site of RNA polymerase II (Pol II) along with associated RNA processing intermediates. In particular, we detected 5' splice site cleavage by the spliceosome, showing that cleaved upstream exon transcripts are associated with Pol II CTD phosphorylated on the serine 5 position (S5P), which is accumulated over downstream exons. Also, depletion of termination factors substantially reduces Pol II pausing at gene ends, leading to termination defects. Notably, termination factors play an additional promoter role by restricting non-productive RNA synthesis in a Pol II CTD S2P-specific manner. Our results suggest that CTD phosphorylation patterns established for yeast transcription are significantly different in mammals. Taken together, mNET-seq provides dynamic and detailed snapshots of the complex events underlying transcription in mammals.
Project description:The transcription cycle of RNA polymerase II (Pol II) correlates with changes to the phosphorylation state of its large subunit C-terminal domain (CTD). We recently developed Native Elongation Transcript sequencing using mammalian cells (mNET-seq), which generates single-nucleotide-resolution genome-wide profiles of nascent RNA and co-transcriptional RNA processing that are associated with different CTD phosphorylation states. Here we provide a detailed protocol for mNET-seq. First, Pol II elongation complexes are isolated with specific phospho-CTD antibodies from chromatin solubilized by micrococcal nuclease digestion. Next, RNA derived from within the Pol II complex is size fractionated and Illumina sequenced. Using mNET-seq, we have previously shown that Pol II pauses at both ends of protein-coding genes but with different CTD phosphorylation patterns, and we have also detected phosphorylation at serine 5 (Ser5-P) CTD-specific splicing intermediates and Pol II accumulation over co-transcriptionally spliced exons. With moderate biochemical and bioinformatic skills, mNET-seq can be completed in ∼6 d, not including sequencing and data analysis.
Project description:Transcription termination in bacteria can occur either via Rho-dependent or independent (intrinsic) mechanisms. Intrinsic terminators are composed of a stem-loop RNA structure followed by a uridine stretch and are known to terminate in a precise manner. In contrast, Rho-dependent terminators have more loosely defined characteristics and are thought to terminate in a diffuse manner. While transcripts ending in an intrinsic terminator are protected from 3'-5' exonuclease digestion due to the stem-loop structure of the terminator, it remains unclear what protects Rho-dependent transcripts from being degraded. In this study, we mapped the exact steady-state RNA 3' ends of hundreds of Escherichia coli genes terminated either by Rho-dependent or independent mechanisms. We found that transcripts generated from Rho-dependent termination have precise 3'-ends at steady state. These termini were localized immediately downstream of energetically stable stem-loop structures, which were not followed by uridine rich sequences. We provide evidence that these structures protect Rho-dependent transcripts from 3'-5' exonucleases such as PNPase and RNase II, and present data localizing the Rho-utilization (rut) sites immediately downstream of these protective structures. This study represents the first extensive in-vivo map of exact RNA 3'-ends of Rho-dependent transcripts in E. coli.
Project description:The recent discovery of RNA/DNA hybrid G-quadruplexes (HQs) and their potentially widespread occurrence in the human genome during transcription suggest a new and generic transcriptional control mechanism. The G-rich sequences in which HQs may form can coincide with those for DNA G-quadruplexes (GQs), which are well known to modulate transcription. Understanding the molecular interaction between HQ and GQ is, therefore, of pivotal importance for dissecting this new mechanism of transcriptional regulation. Using a T7 transcription model, we found that GQ and HQ form in a natural sequence, (GGGGA)4, downstream of many transcription start sites. Using a newly developed single-molecule stalled-transcription assay, we revealed that RNA transcripts helped to populate quadruplexes at the expense of duplexes. Among quadruplexes, HQ predominates over GQ in both population and mechanical stability, suggesting that HQ may serve as a better mechanical block during transcription. The fact that HQ and GQ folded within tens of milliseconds in the presence of RNA transcripts provides justification for the co-transcriptional folding of these species. A catalytic role of RNA transcripts in GQ formation was strongly suggested, as the GQ folded >7 times slower without transcription. These results shed light on the possible synergistic effect of GQs and HQs on transcriptional control.
Project description:Mature spermatozoa contain a whole repertoire of the various classes of cellular RNAs, both coding and non-coding. It was hypothesized that after fertilization they might impact development, a claim supported by experimental evidence in various systems. Despite the increasing interest in the transgenerational maintenance of epigenetic traits and their possible determination by RNAs, little is known about the conservation of these RNAs in sperm and across generations, or about the specificities and mechanisms involved in transgenerational maintenance. We identified two distinct fractions of RNAs in mature mouse sperm, one readily extracted in the aqueous phase of the classical TRIzol procedure and a distinct fraction hybridized with homologous DNA in DNA-RNA complexes recovered from the interface, purified after DNase hydrolysis and analyzed by RNA-seq methodology. This DNA-associated RNA (D RNA) was found to represent as much as half of the cell contents in differentiated sperm, in which a major part of the cytoplasmic material has been discarded. Stable complexes were purified free of proteins and identified as hybrids (R-loops) on the basis of their sensitivity to RNase H hydrolysis. Further analysis by RNA-seq identified transcripts from all the coding and non-coding regions of the genome, thus revealing an extensive wave of transcription, prior to or concomitant with the terminal compaction of the chromatin.
Project description:Gene transcription is controlled and modulated by regulatory regions, including enhancers and promoters. These regions are abundant in unstable, non-coding bidirectional transcription. Using nascent RNA transcription data across hundreds of human samples, we identified over 800,000 regions containing bidirectional transcription. We then identified highly correlated transcription between bidirectional and gene regions. The identified correlated pairs, each consisting of a bidirectional region and a gene, are enriched for disease-associated SNPs and often supported by independent 3D data. We present these resources as an SQL database that serves as a resource for future studies into gene regulation, enhancer-associated RNAs, and transcription factors.
Project description:Cellular protein-RNA complexes assemble on nascent transcripts, but methods to observe transcription and protein binding in real time and at physiological concentrations are not available. Here, we report a single-molecule approach based on zero-mode waveguides that simultaneously tracks transcription progress and the binding of ribosomal protein S15 to nascent RNA transcripts during early ribosome biogenesis. We observe stable binding of S15 to single RNAs immediately after transcription for the majority of the transcripts at 35 °C but for less than half at 20 °C. The remaining transcripts exhibit either rapid and transient binding or are unable to bind S15, likely due to RNA misfolding. Our work establishes the foundation for studying transcription and its coupled co-transcriptional processes, including RNA folding, ligand binding, and enzymatic activity such as in coupling of transcription to splicing, ribosome assembly or translation.
Project description:Background: Eukaryotic genomes undergo pervasive transcription, leading to the production of many types of stable and unstable RNAs. Transcription is not restricted to regions with annotated gene features but includes almost any genomic context. Currently, the source and function of most RNAs originating from intergenic regions in the human genome remain unclear. Results: We hypothesize that many intergenic RNAs can be ascribed to the presence of as-yet unannotated genes or the "fuzzy" transcription of known genes that extends beyond the annotated boundaries. To elucidate the contributions of these two sources, we assemble a dataset of more than 2.5 billion publicly available RNA-seq reads across 5 human cell lines and multiple cellular compartments to annotate transcriptional units in the human genome. About 80% of transcripts from unannotated intergenic regions can be attributed to the fuzzy transcription of existing genes; the remaining transcripts originate mainly from putative long non-coding RNA loci that are rarely spliced. We validate the transcriptional activity of these intergenic RNAs using independent measurements, including transcriptional start sites, chromatin signatures, and genomic occupancies of RNA polymerase II in various phosphorylation states. We also analyze the nuclear localization and sensitivities of intergenic transcripts to nucleases to illustrate that they tend to be rapidly degraded either on-chromatin by XRN2 or off-chromatin by the exosome. Conclusions: We provide a curated atlas of intergenic RNAs that distinguishes alternative processing of well-annotated genes from independent transcriptional units based on the combined analysis of chromatin signatures, nuclear RNA localization, and degradation pathways.
Project description:Transcription termination in bacteria can occur either via Rho-dependent or independent (intrinsic) mechanisms. Intrinsic terminators are composed of a stem-loop RNA structure followed by a uridine stretch and are known to terminate in a precise manner. In contrast, Rho-dependent terminators have more loosely defined characteristics and are thought to terminate in a diffuse manner. While transcripts ending in an intrinsic terminator are protected from 3’-5’ exonuclease digestion due to the stem-loop structure of the terminator, it remains unclear what protects Rho-dependent transcripts from being degraded. In this study, we mapped the exact steady-state RNA 3’ ends of hundreds of E. coli genes terminated either by Rho-dependent or independent mechanisms. We found that transcripts generated from Rho-dependent termination have precise 3’-ends at steady state. These termini were localized immediately downstream of energetically stable stem-loop structures, which were not followed by uridine rich sequences. We provide evidence that these structures protect Rho-dependent transcripts from 3’-5’ exonucleases such as PNPase and RNase II, and present data localizing the Rho-utilization (rut) sites immediately downstream of these protective structures. This study represents the first extensive in-vivo map of exact RNA 3’-ends of Rho-dependent transcripts in E. coli.
Project description:We have generated single-nucleotide resolution, nascent transcription profiles from HeLa cells by developing Native Elongation Transcript sequencing technology for mammalian chromatin (mNET-seq). Our extensive data sets provide a substantial resource to study mammalian nascent transcript profiles. We reveal unanticipated phosphorylation states for RNA polymerase II C-terminal domain (Pol II CTD) at both gene ends. We also observe that following 5’ splice site cleavage by the spliceosome, upstream exon transcripts are tethered to Pol II CTD phosphorylated on the serine 5 position (S5P), which is accumulated over downstream exons. We further show that depletion of termination factors substantially reduces Pol II pausing at gene ends, leading to termination defects. Remarkably, termination factors play an additional promoter role by restricting non-productive RNA synthesis and redistributing Pol II CTD S2P to promoters. These data demonstrate that CTD phosphorylation is more dynamic and variably distributed across mammalian transcription units than previously envisaged. To monitor nascent RNA within the mammalian Pol II complex, and its association with different CTD phosphorylation states, we employed mNET-seq methodology on HeLa cells, complemented with direct sequencing of chromatin-bound RNA (ChrRNA-seq). mNET-seq was performed using the antibodies 8WG16, CMA602, CMA603 and CMA601, which are specific for unphosphorylated CTD, Ser2 phosphorylation, Ser5 phosphorylation and all CTD isoforms, respectively. In another experiment, to evaluate the effect of transcription termination factors on nascent RNA production by Pol II, mNET-seq complemented with ChrRNA-seq was performed on HeLa cells transfected with siRNA against PTBP1, CPSF73, CstF64+CstF64tau or Xrn2, and the gene profiles were compared with profiles from HeLa cells transfected with siRNA against Luciferase, generated by the same protocol.