Analysis of alternative cleavage and polyadenylation by 3' region extraction and deep sequencing
ABSTRACT: Alternative cleavage and polyadenylation (APA) generates diverse mRNA isoforms. We developed 3' region extraction and deep sequencing (3'READS) to address mispriming issues that commonly plague poly(A) site (pA) identification, and we used the method to comprehensively map pAs in the mouse genome. Thorough annotation of gene 3' ends revealed over 5,000 previously overlooked pAs (~8% of total) flanked by A-rich sequences, underscoring the necessity of using an accurate tool for pA mapping. About 79% of mRNA genes and 66% of long noncoding RNA genes undergo APA, but these two gene types have distinct usage patterns for pAs in introns and upstream exons. Quantitative analysis of APA isoforms by 3'READS indicated that promoter-distal pAs, regardless of intron or exon locations, become more abundant during embryonic development and cell differentiation and that upregulated isoforms have stronger pAs, suggesting global modulation of the 3' end–processing activity in development and differentiation. 3'READS to map pAs in mouse genome
Project description:Sequencing of the 3’ end of poly(A)+ RNA identifies cleavage and polyadenylation sites (pAs) and measures transcript expression. We previously developed a method, 3’ region extraction and deep sequencing (3’READS), to address mispriming issues that often plague 3’ end sequencing. Here we report a new version, named 3’READS+, which has vastly improved accuracy and sensitivity. Using a special locked nucleic acid oligo to capture poly(A)+ RNA and to remove the bulk of the poly(A) tail, 3’READS+ generates RNA fragments with an optimal number of terminal As that balance data quality and detection of genuine pAs. With improved RNA ligation steps for efficiency, the method shows much higher sensitivity (over two orders of magnitude) compared to the previous version. Using 3’READS+, we have uncovered a sizable fraction of previously overlooked pAs located next to or within a stretch of adenylate residues in human genes, and more accurately assessed the frequency of alternative cleavage and polyadenylation (APA) in HeLa cells (~50%). 3’READS+ will be a useful tool to accurately study APA and to analyze gene expression by 3’ end counting, especially when the amount of input total RNA is limited. Nine 3'READS+ libraries were made with different amounts (100 ng, 200 ng, 400 ng, 1000 ng, 5000 ng, 15000 ng) of input HeLa RNA.
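The idea of using terminal As to separate genuine pAs from mispriming at genomic A-rich stretches can be illustrated with a small sketch. This is not the 3’READS+ pipeline itself: the function names, the input representation (the unaligned 3’ tail of a read and the genomic sequence just downstream of its alignment end), and the threshold of two non-templated As are all assumptions made for illustration.

```python
def leading_a_run(seq: str) -> int:
    """Return the length of the run of 'A' at the start of seq (case-insensitive)."""
    n = 0
    for base in seq.upper():
        if base != "A":
            break
        n += 1
    return n


def nontemplated_a_count(read_tail: str, genome_downstream: str) -> int:
    """Count terminal As in the read that cannot be explained by genomic As.

    read_tail: bases of the read past the putative cleavage site (hypothetical input).
    genome_downstream: genomic bases immediately downstream of that site.
    """
    tail_as = leading_a_run(read_tail)
    genomic_as = leading_a_run(genome_downstream)
    return max(0, tail_as - genomic_as)


def supports_pa(read_tail: str, genome_downstream: str, min_nontemplated: int = 2) -> bool:
    """Assumed rule: a read supports a pA if it carries enough non-templated As."""
    return nontemplated_a_count(read_tail, genome_downstream) >= min_nontemplated
```

Under these assumptions, a read ending in five As next to a genomic stretch of two As would still count as evidence for a pA, whereas a read whose As are all matched by the genome would not.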
Project description:Alternative polyadenylation (APA) of mRNAs has emerged as an important mechanism for post-transcriptional gene regulation in higher eukaryotes. Although microarrays have recently been used to characterize APA globally, they have a number of serious limitations that prevent comprehensive and highly quantitative analysis. To better characterize APA and its regulation, we have developed a deep sequencing-based method called Poly(A) Site Sequencing (PAS-Seq) for quantitatively profiling RNA polyadenylation at the transcriptome level. PAS-Seq not only accurately and comprehensively identifies poly(A) junctions in mRNAs and noncoding RNAs, but also provides quantitative information on the relative abundance of polyadenylated RNAs. We first analyzed the HeLa cell transcriptome using PAS-Seq to demonstrate that PAS-Seq not only accurately and comprehensively identifies poly(A) junctions, but also provides quantitative information on the relative abundance of APA isoforms. Next we analyzed mouse embryonic stem cells, neural stem/progenitor cells, and neurons by PAS-Seq to characterize the dynamic changes of mRNA polyadenylation during stem cell differentiation.
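As a rough illustration of what "relative abundance of APA isoforms" can mean in practice, the sketch below converts per-site read counts into per-gene usage fractions. The data layout (a dictionary keyed by gene and poly(A) site identifier) and the function name are hypothetical and are not taken from the PAS-Seq analysis.

```python
from collections import defaultdict


def relative_usage(counts: dict) -> dict:
    """Convert read counts per (gene, pa_id) into fractions of each gene's total.

    counts: mapping of (gene, pa_id) -> number of reads supporting that site.
    Genes with zero total reads are omitted from the result.
    """
    totals = defaultdict(int)
    for (gene, _pa_id), n in counts.items():
        totals[gene] += n
    return {
        key: n / totals[key[0]]
        for key, n in counts.items()
        if totals[key[0]] > 0
    }


# Illustrative counts only (not real data).
example = {
    ("GeneA", "proximal"): 30,
    ("GeneA", "distal"): 70,
    ("GeneB", "single"): 5,
}
usage = relative_usage(example)
```

Comparing such fractions between samples (for example, before and after differentiation) is one simple way to describe a shift toward proximal or distal sites.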
Project description:Alternative cleavage and polyadenylation (APA) results in mRNA isoforms containing different 3’ untranslated regions (3’UTRs) and/or coding sequences. How core cleavage and polyadenylation (C/P) factors regulate APA is not well understood. Using siRNA knockdown coupled with deep sequencing, we found that several C/P factors can play significant roles in 3’UTR-APA. Whereas Pcf11 and Fip1 enhance usage of proximal poly(A) sites (pAs), CFI-25/68, PABPN1, and PABPC1 promote usage of distal pAs. Strong cis element biases were found for pAs regulated by CFI or Fip1, and the distance between pAs plays an important role in APA regulation. In addition, intronic pAs are substantially regulated by splicing factors, with U1 mostly influencing C/P events in 5’ introns and U2 impacting those in efficiently spliced introns. Furthermore, PABPN1 regulates expression of transcripts with pAs near the transcription start site, a property possibly related to its role in RNA degradation. Finally, we found that groups of APA events regulated by C/P factors are also modulated in cell differentiation and development with distinct trends. Together, our results indicate that the abundance of different C/P factors and splicing factors plays diverse roles in APA, and is relevant to APA regulation in biological conditions. knockdown experiments of 23 C/P factors, 3 splicing factors and U1D in mouse C2C12 myoblast cells
Project description:Most mammalian genes display alternative cleavage and polyadenylation (APA). Previous studies have indicated preferential expression of APA isoforms with short 3’UTRs in testes. Here we show widespread shortening of 3’UTR by APA during the first wave of spermatogenesis in mouse, with 3’UTRs being the shortest in spermatids. Shortening of 3’UTR eliminates destabilizing elements, such as U-rich elements and transposable elements, which appear to be highly potent for transcript elimination during spermatogenesis. We additionally found widespread regulation of APA in introns and global activation of upstream antisense transcripts during spermatogenesis. Interestingly, genes that display 3’UTR shortening tend to have higher levels of H3K4me3, consistent with the open chromatin feature previously observed in spermatids. Since genes with 3’UTR shortening tend to have functions important for further sperm development after spermatids, when transcription is halted, this result indicates that expression of short, stable mRNAs may serve the purpose of mRNA storage for later translation. Thus, APA in spermatogenesis connects regulation of chromatin status with post-transcriptional control, and impacts sperm maturation. 3'READS of 1 week to 6 week of testis development
Project description:RNA synthesis and decay rates determine the steady-state levels of cellular RNAs. Metabolic tagging of newly transcribed RNA by 4-thiouridine (4sU) can reveal the relative contributions of RNA synthesis and decay rates. The kinetics of RNA processing, however, has so far remained unresolved. Here, we show that ultra-short 4sU-tagging not only provides snap-shot pictures of eukaryotic gene expression but, when combined with progressive 4sU-tagging and RNA-seq, reveals global RNA processing kinetics at nucleotide resolution. Using this method, we identified classes of rapidly and slowly spliced/degraded introns. Interestingly, each class of splicing kinetics was characterized by a distinct association with intron length, gene length and splice site strength. For a large group of introns, we also observed long-lasting retention in the primary transcript, but efficient secondary splicing or degradation at later time points. Finally, we show that processing of most, but not all, small nucleolar (sno)RNA-containing introns is remarkably inefficient, with the majority of introns being spliced and degraded rather than processed into mature snoRNAs. In summary, our study yields unparalleled insights into the kinetics of RNA processing and provides the tools to study molecular mechanisms of RNA processing and their contribution to the regulation of gene expression. 4sU-tagging was performed in human DG75 B-cells by adding 500 µM 4sU to cell culture medium for 5, 10, 15, 20 or 60 min. Following isolation of total cellular RNA, this was separated into nascent and untagged, pre-existing RNA. Nascent RNA as well as total and untagged RNA from 60 min 4sU-tagging were subjected to SOLiD sequencing (SOLiD II), obtaining 35 nt reads.
Project description:Alternative polyadenylation (APA) of mRNAs has emerged as an important mechanism for post-transcriptional gene regulation in higher eukaryotes. Although microarrays have recently been used to characterize APA globally, they have a number of serious limitations that prevent comprehensive and highly quantitative analysis. To better characterize APA and its regulation, we have developed a deep sequencing-based method called Poly(A) Site Sequencing (PAS-Seq) for quantitatively profiling RNA polyadenylation at the transcriptome level. PAS-Seq not only accurately and comprehensively identifies poly(A) junctions in mRNAs and noncoding RNAs, but also provides quantitative information on the relative abundance of polyadenylated RNAs. Overall design: We first analyzed the HeLa cell transcriptome using PAS-Seq to demonstrate that PAS-Seq not only accurately and comprehensively identifies poly(A) junctions, but also provides quantitative information on the relative abundance of APA isoforms. Next we analyzed mouse embryonic stem cells, neural stem/progenitor cells, and neurons by PAS-Seq to characterize the dynamic changes of mRNA polyadenylation during stem cell differentiation.
Project description:APA mechanisms in S. pombe and S. cerevisiae are largely different, distinctly impacting gene expression, antisense transcripts and 3’UTR evolution, and core processing factors regulate APA in a PAS-dependent manner. Overall design: APA mechanisms in S. pombe and S. cerevisiae
Project description:The PAF complex (Paf1C) has been shown to regulate chromatin modifications, gene transcription, and PolII elongation. Here, we provide the first genome-wide analysis of chromatin occupancy by the entire PAF complex in mammalian cells. We show that Paf1C is recruited not only to promoters and gene bodies, but also to regions downstream of cleavage/polyadenylation (pA) sites at 3’ ends, a profile that sharply contrasted with the yeast complex. Remarkably, our studies identified novel, subunit-specific links between Paf1C and regulation of alternative cleavage and polyadenylation (APA) and upstream antisense transcription. Moreover, we found that depletion of Paf1C subunits also resulted in the accumulation of RNA polymerase II (PolII) over gene bodies, which coincided with APA. Depletion of specific Paf1C subunits leads to global loss of histone H2B ubiquitylation, but surprisingly, there is little impact of Paf1C depletion on other histone modifications, including the tri-methylation of histone H3 on lysines 4 and 36 (H3K4me3 and H3K36me3), previously associated with this complex. Our results provide surprising differences with yeast, while unifying observations that link Paf1C with PolII elongation and RNA processing, and suggest that Paf1C could play a role in protecting transcripts from premature cleavage by preventing PolII accumulation at TSS-proximal pA sites. ChIP-seq, RNA-seq and 3'READS of Paf1C factors in mouse C2C12 myoblast cells
Project description:Both canonical and alternative splicing of RNAs are governed by intronic sequence elements and produce transient lariat structures fastened by branch-points within introns. To map precisely the location of branch-points on a genomic scale, we developed LaSSO (Lariat Sequence Site Origin), a data-driven algorithm which utilizes RNA-seq data. Using fission yeast cells lacking the debranching enzyme Dbr1, LaSSO not only accurately identified canonical splicing events, but also pinpointed novel, but rare, exon-skipping events, which may reflect aberrantly spliced transcripts. Compromised intron turnover perturbed gene regulation at multiple levels, including splicing and protein translation. Notably, Dbr1 function was also critical for the expression of mitochondrial genes, and for the processing of self-spliced mitochondrial introns. LaSSO showed better sensitivity and accuracy than algorithms used for computational branch-point prediction or for empirical branch-point determination. Even when applied to a human data set acquired in the presence of debranching activity, LaSSO identified both canonical and exon-skipping branch-points. LaSSO thus provides an effective, accurate and unbiased approach for defining high-resolution maps of branch-site sequences and intronic elements on a genomic scale. LaSSO should be useful to validate introns and uncover branch-point sequences in any eukaryote, and it could be integrated into RNA-seq pipelines. Interrogation of the S. pombe transcriptome using rRNA-depleted, strand-specific RNA sequencing (Illumina HiSeq 2000) in wild-type and dbr1.Δ cultures. A total of 4 samples were analyzed: two biological repeats of the wild-type strain and two biological repeats of dbr1.Δ.
Project description:The primary structure and phosphorylation pattern of the tandem YSPTSPS repeats of the RNA polymerase II CTD comprise an informational code that coordinates transcription, chromatin modification, and RNA processing. To gauge the contributions of individual CTD coding “letters” to gene expression, we analyzed the poly(A)+ transcriptomes of fission yeast mutants that lack each of the four inessential CTD phosphoacceptors: Tyr1, Ser2, Thr4, and Ser7. There was a hierarchy of CTD mutational effects with respect to the number of dysregulated protein-coding RNAs, with S2A (n=227) >> Y1F (n=71) > S7A (n=58) >> T4A (n=7). The majority of the protein-coding RNAs affected in Y1F cells were coordinately affected by S2A, suggesting that Tyr1-Ser2 constitutes a two-letter code “word”. Y1F and S2A elicited increased expression of genes encoding proteins involved in iron uptake (Frp1, Fip1, Fio1, Str3, Str1, Sib1), without affecting the expression of the genes that repress the iron regulon, implying that Tyr1-Ser2 transduces a repressive signal. Y1F and S2A cells had increased levels of ferric reductase activity and were hypersensitive to phleomycin, indicative of elevated intracellular iron. The T4A and S7A mutations had opposing effects on the phosphate response pathway. T4A reduced the expression of two genes encoding proteins involved in phosphate acquisition (the Pho1 acid phosphatase and the phosphate transporter SPBC8E4.01c), without affecting the expression of known genes that regulate the phosphate response pathway, while S7A increased pho1+ expression. Meiotic genes were enriched among those up-regulated in S7A cells. These results highlight specific cellular gene expression programs that are responsive to distinct CTD cues. Interrogation of the S. pombe transcriptome using polyA+ strand-specific RNA sequencing (Illumina HiSeq 2000) in cultures. A total of 16 samples were analysed: two biological repeats of each of the WT, Y1F, S2A, T4A, S7A, Y1F-S7A, S2A-S7A and T4A-S7A strains.