Project description:The 3' untranslated regions (3' UTRs) of mRNAs contain cis-acting elements for posttranscriptional regulation of gene expression. Here, we report that mouse genes tend to express mRNAs with longer 3' UTRs as embryonic development progresses. This global regulation is controlled by alternative polyadenylation and coordinates with initiation of organogenesis and aspects of embryonic development, including morphogenesis, differentiation, and proliferation. Using myogenesis of C2C12 myoblast cells as a model, we recapitulated this process in vitro and found that 3' UTR lengthening is likely caused by weakening of mRNA polyadenylation activity. Because alternative 3' UTR sequences are typically longer and have higher AU content than constitutive ones, our results suggest that lengthening of 3' UTR can significantly augment posttranscriptional control of gene expression during embryonic development, such as microRNA-mediated regulation.
Project description:Post-transcriptional regulation, often mediated by miRNAs and RNA-binding proteins at the 3' untranslated regions (UTRs) of mRNAs, is implicated in important roles in the output of transcriptome. To decipher this layer of gene regulation, it is essential to measure global mRNA expression quantitatively in a 3'-UTR-specific manner. Here we establish an experimental and bioinformatics pipeline that simultaneously determines 3'-end formation by leveraging local nucleotide composition and quantitatively measures mRNA expression by sequencing polyadenylated transcripts. When applied to purified mouse embryonic skin stem cells and their daughter lineages, we identify 18,060 3' UTRs representing 12,739 distinct mRNAs that are abundantly expressed in the skin. We determine that ∼78% of UTRs are formed by using canonical A[A/U]UAAA polyadenylation signals, whereas ∼22% of UTRs use alternative signals. By comparing to relative and absolute mRNA abundance determined by qPCR, our RNA-seq approach can precisely measure mRNA fold-change and accurately determine the expression of mRNAs over four orders of magnitude. Surprisingly, only 829 out of 12,739 genes show differential 3'-end usage between embryonic skin stem cells and their immediate daughter cells, whereas the numbers increase to 933 genes when comparing embryonic skin stem cells with the more remotely related hair follicle cells. This suggests an evolving diversity instead of switch-like dynamics in 3'-end formation during development. Finally, core components of the miRNA pathway including Dicer, Dgcr8, Xpo5, and Argonautes show dynamic 3'-UTR formation patterns, indicating a self-regulatory mechanism. Together, our quantitative analysis reveals a dynamic picture of mRNA 3'-end formation in tissue stem cell lineages in vivo.
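The canonical-signal classification mentioned in the description above can be illustrated with a minimal sketch. It only shows how the A[A/U]UAAA motif could be matched in a sequence upstream of a cleavage site; the function names, the input format, and the idea of passing pre-extracted upstream sequences are assumptions for illustration, not the pipeline used in the study.

```python
import re

# Canonical polyadenylation signal A[A/U]UAAA, written in DNA letters
# (AATAAA or ATTAAA). RNA input is converted by replacing U with T.
CANONICAL_PAS = re.compile(r"A[AT]TAAA")


def has_canonical_pas(upstream_seq: str) -> bool:
    """Return True if the sequence contains AATAAA or ATTAAA."""
    dna = upstream_seq.upper().replace("U", "T")
    return CANONICAL_PAS.search(dna) is not None


def canonical_fraction(upstream_seqs) -> float:
    """Fraction of sequences that carry a canonical signal (hypothetical helper)."""
    seqs = list(upstream_seqs)
    if not seqs:
        return 0.0
    return sum(has_canonical_pas(s) for s in seqs) / len(seqs)
```

Applied to the upstream regions of a set of annotated 3' ends, `canonical_fraction` would give the kind of canonical-versus-alternative split reported in the description, under the assumption that a fixed upstream window has already been extracted for each site.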
Project description:Cell-to-cell variability in gene expression is important for many processes in biology, including embryonic development and stem cell homeostasis. While heterogeneity of gene expression levels has been extensively studied, less attention has been paid to mRNA polyadenylation isoform choice. 3' untranslated regions regulate mRNA fate, and their choice is tightly controlled during development, but how 3' isoform usage varies within genetically and developmentally homogeneous cell populations has not been explored. Here, we perform genome-wide quantification of polyadenylation site usage in single mouse embryonic and neural stem cells using a novel single-cell transcriptomic method, BATSeq. By applying BATBayes, a statistical framework for analyzing single-cell isoform data, we find that while the developmental state of the cell globally determines isoform usage, single cells from the same state differ in the choice of isoforms. Notably this variation exceeds random selection with equal preference in all cells, a finding that was confirmed by RNA FISH data. Variability in 3' isoform choice has potential implications on functional cell-to-cell heterogeneity as well as utility in resolving cell populations.
Project description:The 3' untranslated regions (3'UTRs) of mRNAs contain cis elements involved in post-transcriptional regulation of gene expression. Over half of all mammalian genes contain multiple polyadenylation sites that lead to different 3'UTRs for a gene. Studies have shown that the alternative polyadenylation (APA) pattern varies across tissues, and is dynamically regulated in proliferating or differentiating cells. Generation of induced pluripotent stem (iPS) cells, in which differentiated cells are reprogrammed to an embryonic stem (ES) cell-like state, has been intensively studied in recent years. However, it is not known how 3'UTRs are regulated during cell reprogramming. Using a computational method that robustly examines APA across DNA microarray data sets, we analyzed 3'UTR dynamics in generation of iPS cells from different cell types. We found that 3'UTRs shorten during reprogramming of somatic cells, the extent of which depends on the type of source cell. By contrast, reprogramming of spermatogonial cells involves 3'UTR lengthening. The alternative polyadenylation sites that are highly responsive to change of cell state in generation of iPS cells are also highly regulated during embryonic development in opposite directions. Compared with other sites, they are more conserved, can lead to longer alternative 3'UTRs, and are associated with more cis elements for polyadenylation. Consistently, reprogramming of somatic cells and germ cells involves significant upregulation and downregulation, respectively, of mRNAs encoding polyadenylation factors, and RNA processing is one of the most significantly regulated biological processes during cell reprogramming.
Furthermore, genes containing target sites of ES cell-specific microRNAs (miRNAs) in different portions of 3'UTR are distinctively regulated during cell reprogramming, suggesting impact of APA on miRNA targeting. Taken together, these findings indicate that reprogramming of 3'UTRs by APA, which results from regulation of both general polyadenylation activity and cell type-specific factors and can reset post-transcriptional gene regulatory programs in the cell, is an integral part of iPS cell generation, and the APA pattern can be a good biomarker for cell type and state, useful for sample classification. The results also suggest that perturbation of the mRNA polyadenylation machinery or RNA processing activity may facilitate generation of iPS cells.
Project description:A recently developed strategy of sequencing alternative polyadenylation (APA) sites (SAPAS) with second-generation sequencing technology can be used to explore complete genome-wide patterns of tandem APA sites and global gene expression profiles. Spermatogonial stem cells (SSCs) maintain long-term reproductive abilities in male mammals. The detailed mechanisms by which SSCs self-renew and generate mature spermatozoa are not clear. To understand the specific alternative polyadenylation pattern and global gene expression profile of male germline stem cells (GSCs, mainly referring to SSCs here), we isolated and purified mouse Thy1+ cells from testis by magnetic-activated cell sorting (MACS) and then used the SAPAS method for analysis, using pluripotent embryonic stem cells (ESCs) and differentiated mouse embryonic fibroblast cells (MEFs) as controls. As a result, we obtained 99,944 poly(A) sites, approximately 40% of which were newly detected in our experiments. These poly(A) sites originated from three mouse cell types and covered 17,499 genes, including 831 long non-coding RNA (lncRNA) genes. We observed that GSCs tend to have shorter 3'UTR lengths while MEFs tend towards longer 3'UTR lengths. We also identified 1337 genes that were highly expressed in GSCs, and these genes were highly consistent with the functional characteristics of GSCs. Our detailed bioinformatics analysis identified APA site-switching events at 3'UTRs and many new specifically expressed genes in GSCs, which we experimentally confirmed. Furthermore, qRT-PCR was performed to validate several events of the 334 genes with distal-to-proximal poly(A) switch in GSCs. Consistently, an APA reporter assay confirmed the total 3'UTR shortening in GSCs compared to MEFs. We also analyzed the cis elements around the proximal poly(A) site preferentially used in GSCs and found that C-rich elements may contribute to this regulation.
Overall, our results identified the expression level and polyadenylation site profiles and these data provide new insights into the processes potentially involved in the GSC life cycle and spermatogenesis.
Project description:Background: Most eukaryotic protein-coding genes exhibit alternative cleavage and polyadenylation (APA), resulting in mRNA isoforms with different 3' untranslated regions (3' UTRs). Studies have shown that brain cells tend to express long 3' UTR isoforms using distal cleavage and polyadenylation sites (PASs). Methods: Using our recently developed, comprehensive PAS database PolyA_DB, we developed an efficient method to examine APA, named Significance Analysis of Alternative Polyadenylation using RNA-seq (SAAP-RS). We applied this method to study APA in brain cells and neurogenesis. Results: We found that neurons globally express longer 3' UTRs than other cell types in brain, and microglia and endothelial cells express substantially shorter 3' UTRs. We show that the 3' UTR diversity across brain cells can be corroborated with single cell sequencing data. Further analysis of APA regulation of 3' UTRs during differentiation of embryonic stem cells into neurons indicates that a large fraction of the APA events regulated in neurogenesis are similarly modulated in myogenesis, but to a much greater extent. Conclusion: Together, our data delineate APA profiles in different brain cells and indicate that APA regulation in neurogenesis is largely an augmented process taking place in other types of cell differentiation.
Project description:Messenger RNA polyadenylation is one of the key post-transcriptional events in eukaryotic cells. A large number of genes in mammalian species can undergo alternative polyadenylation, which leads to mRNAs with variable 3' ends. As the 3' end of mRNAs often contains cis elements important for mRNA stability, mRNA localization and translation, the implications of the regulation of polyadenylation can be multifold. Alternative polyadenylation is controlled by cis elements and trans factors, and is believed to occur in a tissue- or disease-specific manner. Given the availability of many databases devoted to other aspects of mRNA metabolism, such as transcriptional initiation and splicing, systematic information on polyadenylation, including alternative polyadenylation and its regulation, is noticeably lacking. Here, we present a database named polyA_DB, through which we strive to provide several types of information regarding polyadenylation in mammalian species: (i) polyadenylation sites and their locations with respect to the genomic structure of genes; (ii) cis elements surrounding polyadenylation sites; (iii) comparison of polyadenylation configuration between orthologous genes; and (iv) tissue/organ information for alternative polyadenylation sites. Currently, polyA_DB contains 45,565 polyadenylation sites for 25,097 human and mouse genes, representing the most comprehensive polyadenylation database to date. The database is accessible via the website (http://polya.umdnj.edu/polyadb).
Project description:mRNA polyadenylation is a critical cellular process in eukaryotes. It involves 3' end cleavage of nascent mRNAs and addition of the poly(A) tail, which plays important roles in many aspects of the cellular metabolism of mRNA. The process is controlled by various cis-acting elements surrounding the cleavage site, and their binding factors. In this study, we surveyed genome regions containing cleavage sites [herein called poly(A) sites], for 13,942 human and 11,155 mouse genes. We found that a great proportion of human and mouse genes have alternative polyadenylation (approximately 54% and 32%, respectively). The conservation of alternative polyadenylation type or polyadenylation configuration between human and mouse orthologs is statistically significant, indicating that alternative polyadenylation is widely employed by these two species to produce alternative gene transcripts. Genes belonging to several functional groups, indicated by their Gene Ontology annotations, are biased with respect to polyadenylation configuration. Many poly(A) sites harbor multiple cleavage sites (51.25% human and 46.97% mouse sites), leading to heterogeneous 3' end formation for transcripts. This implies that the cleavage process of polyadenylation is largely imprecise. Different types of poly(A) sites, with regard to their relative locations in a gene, are found to have distinct nucleotide composition in surrounding genomic regions. This large-scale study provides important insights into the mechanism of polyadenylation in mammalian species and represents a genomic view of the regulation of gene expression by alternative polyadenylation.
Project description:Embryonic stem cells (ESCs) exhibit a unique cell cycle with a shortened G1 phase that supports their pluripotency, while apparently buffering them against pro-differentiation stimuli. In ESCs, expression of replication-dependent histones is a main component of this abbreviated G1 phase, although the details of this mechanism are not well understood. Similarly, the role of 3' end processing in regulation of ESC pluripotency and cell cycle is poorly understood. To better understand these processes, we examined mouse ESCs that lack the 3' end-processing factor CstF-64. These ESCs display slower growth, loss of pluripotency and a lengthened G1 phase, correlating with increased polyadenylation of histone mRNAs. Interestingly, these ESCs also express the τCstF-64 paralog of CstF-64. However, τCstF-64 only partially compensates for lost CstF-64 function, despite being recruited to the histone mRNA 3' end-processing complex. Reduction of τCstF-64 in CstF-64-deficient ESCs results in even greater levels of histone mRNA polyadenylation, suggesting that both CstF-64 and τCstF-64 function to inhibit polyadenylation of histone mRNAs. These results suggest that CstF-64 plays a key role in modulating the cell cycle in ESCs while simultaneously controlling histone mRNA 3' end processing.