Reprogramming of 3' untranslated regions of mRNAs by alternative polyadenylation in generation of pluripotent stem cells from different cell types.
ABSTRACT: The 3' untranslated regions (3'UTRs) of mRNAs contain cis elements involved in post-transcriptional regulation of gene expression. Over half of all mammalian genes contain multiple polyadenylation sites that lead to different 3'UTRs for a gene. Studies have shown that the alternative polyadenylation (APA) pattern varies across tissues, and is dynamically regulated in proliferating or differentiating cells. Generation of induced pluripotent stem (iPS) cells, in which differentiated cells are reprogrammed to an embryonic stem (ES) cell-like state, has been intensively studied in recent years. However, it is not known how 3'UTRs are regulated during cell reprogramming. Using a computational method that robustly examines APA across DNA microarray data sets, we analyzed 3'UTR dynamics in generation of iPS cells from different cell types. We found that 3'UTRs shorten during reprogramming of somatic cells, the extent of which depends on the type of source cell. By contrast, reprogramming of spermatogonial cells involves 3'UTR lengthening. The alternative polyadenylation sites that are highly responsive to change of cell state in generation of iPS cells are also highly regulated during embryonic development in opposite directions. Compared with other sites, they are more conserved, can lead to longer alternative 3'UTRs, and are associated with more cis elements for polyadenylation. Consistently, reprogramming of somatic cells and germ cells involves significant upregulation and downregulation, respectively, of mRNAs encoding polyadenylation factors, and RNA processing is one of the most significantly regulated biological processes during cell reprogramming.
Furthermore, genes containing target sites of ES cell-specific microRNAs (miRNAs) in different portions of 3'UTR are distinctively regulated during cell reprogramming, suggesting impact of APA on miRNA targeting. Taken together, these findings indicate that reprogramming of 3'UTRs by APA, which results from regulation of both general polyadenylation activity and cell type-specific factors and can reset post-transcriptional gene regulatory programs in the cell, is an integral part of iPS cell generation, and the APA pattern can be a good biomarker for cell type and state, useful for sample classification. The results also suggest that perturbation of the mRNA polyadenylation machinery or RNA processing activity may facilitate generation of iPS cells.
Project description:Background: Most eukaryotic protein-coding genes exhibit alternative cleavage and polyadenylation (APA), resulting in mRNA isoforms with different 3' untranslated regions (3' UTRs). Studies have shown that brain cells tend to express long 3' UTR isoforms using distal cleavage and polyadenylation sites (PASs). Methods: Using our recently developed, comprehensive PAS database PolyA_DB, we developed an efficient method to examine APA, named Significance Analysis of Alternative Polyadenylation using RNA-seq (SAAP-RS). We applied this method to study APA in brain cells and neurogenesis. Results: We found that neurons globally express longer 3' UTRs than other cell types in brain, and microglia and endothelial cells express substantially shorter 3' UTRs. We show that the 3' UTR diversity across brain cells can be corroborated with single cell sequencing data. Further analysis of APA regulation of 3' UTRs during differentiation of embryonic stem cells into neurons indicates that a large fraction of the APA events regulated in neurogenesis are similarly modulated in myogenesis, but to a much greater extent. Conclusion: Together, our data delineate APA profiles in different brain cells and indicate that APA regulation in neurogenesis is largely an augmented process taking place in other types of cell differentiation.
Project description:The 3' untranslated region (3' UTR) of human mRNAs plays a critical role in controlling protein expression and function. Importantly, 3' UTRs of human messages are not invariant for each gene but rather are shaped by alternative polyadenylation (APA) in a cell state-dependent manner, including in response to T cell activation. However, the proteins and mechanisms driving APA regulation remain poorly understood. Here we show that the RNA-binding protein CELF2 controls APA of its own message in a signal-dependent manner by competing with core enhancers of the polyadenylation machinery for binding to RNA. We further show that CELF2 binding overlaps with APA enhancers transcriptome-wide, and almost half of 3' UTRs that undergo T cell signaling-induced APA are regulated in a CELF2-dependent manner. These studies thus reveal CELF2 to be a critical regulator of 3' UTR identity in T cells and demonstrate an additional mechanism for CELF2 in regulating polyadenylation site choice.
Project description:Tandem 3' UTRs produced by alternative polyadenylation (APA) play an important role in gene expression by impacting mRNA stability, translation, and translocation in cells. Several studies have investigated APA site switching in various physiological states; nevertheless, they only focused on either the genes with two known APA sites or several candidate genes. Here, we developed a strategy to study APA sites in a genome-wide fashion with second-generation sequencing technology which could not only identify new polyadenylation sites but also analyze the APA site switching of all genes, especially those with more than two APA sites. We used this strategy to explore the profiling of APA sites in two human breast cancer cell lines, MCF7 and MB231, and one cultured mammary epithelial cell line, MCF10A. More than half of the identified polyadenylation sites are not included in human poly(A) databases. While MCF7 showed shortening 3' UTRs, more genes in MB231 switched to distal poly(A) sites. Several gene ontology (GO) terms and pathways were enriched in the list of genes with switched APA sites, including cell cycle, apoptosis, and metabolism. These results suggest a more complex regulation of APA sites in cancer cells than previously thought. In short, our novel unbiased method can be a powerful approach to cost-effectively investigate the complex mechanism of 3' UTR switching in a genome-wide fashion among various physiological processes and diseases.
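One simple way to summarize APA site switching for a gene with any number of poly(A) sites is a usage-weighted mean 3' UTR length, compared between two samples. The sketch below is only an illustration under stated assumptions: the function names, the input layout (3' UTR length at each site mapped to a read count), and the metric itself are hypothetical and are not taken from the study's pipeline.

```python
from typing import Dict


def weighted_utr_length(site_reads: Dict[int, int]) -> float:
    """Usage-weighted mean 3' UTR length for one gene.

    site_reads maps the 3' UTR length (nt) produced by each poly(A) site
    to the number of reads supporting that site (hypothetical layout).
    """
    total = sum(site_reads.values())
    if total == 0:
        raise ValueError("gene has no reads at any poly(A) site")
    return sum(length * count for length, count in site_reads.items()) / total


def utr_shift(control: Dict[int, int], sample: Dict[int, int]) -> float:
    """Negative values suggest a shift toward proximal sites (shortening),
    positive values a shift toward distal sites (lengthening)."""
    return weighted_utr_length(sample) - weighted_utr_length(control)


# Toy counts for a gene with three poly(A) sites (illustrative numbers only).
control = {200: 10, 800: 30, 1500: 60}
sample = {200: 60, 800: 30, 1500: 10}
shift = utr_shift(control, sample)
```

Because the metric averages over all sites, it handles genes with more than two poly(A) sites without reducing them to a proximal/distal pair; a real analysis would also need a statistical test on the read counts.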
Project description:Most eukaryotic genes express alternative polyadenylation (APA) isoforms with different 3'UTR lengths, production of which is influenced by cellular conditions. Here, we show that arsenic stress elicits global shortening of 3'UTRs through preferential usage of proximal polyadenylation sites during stress and enhanced degradation of long 3'UTR isoforms during recovery. We demonstrate that RNA-binding protein TIA1 preferentially interacts with alternative 3'UTR sequences through U-rich motifs, correlating with stress granule association and mRNA decay of long 3'UTR isoforms. By contrast, genes with shortened 3'UTRs due to stress-induced APA can evade mRNA clearance and maintain transcript abundance post stress. Furthermore, we show that stress causes distinct 3'UTR size changes in proliferating and differentiated cells, highlighting its context-specific impacts on the 3'UTR landscape. Together, our data reveal a global, 3'UTR-based mRNA stability control in stressed cells and indicate that APA can function as an adaptive mechanism to preserve mRNAs in response to stress.
Project description:During pre-mRNA maturation 3' end processing can occur at different polyadenylation sites in the 3' untranslated region (3' UTR) to give rise to transcript isoforms that differ in the length of their 3' UTRs. Longer 3' UTRs contain additional cis-regulatory elements that impact the fate of the transcript and/or of the resulting protein. Extensive alternative polyadenylation (APA) has been observed in cancers, but the mechanisms and roles remain elusive. In particular, it is unclear whether the APA occurs in the malignant cells or in other cell types that infiltrate the tumor. To resolve this, we developed a computational method, called SCUREL, that quantifies changes in 3' UTR length between groups of cells, including cells of the same type originating from tumor and control tissue. We used this method to study APA in human lung adenocarcinoma (LUAD). SCUREL relies solely on annotated 3' UTRs and, on control systems such as T cell activation and spermatogenesis, gives qualitatively similar results at much greater sensitivity compared to the previously published scAPA method. In the LUAD samples, we find a general trend toward 3' UTR shortening not only in cancer cells compared to the cell type of origin, but also when comparing other cell types from the tumor vs. the control tissue environment. However, we also find high variability in the individual targets between patients. The findings help in understanding the extent and impact of APA in LUAD, which may support improvements in diagnosis and treatment.
Project description:3' untranslated regions (3' UTRs) post-transcriptionally regulate mRNA stability, localization, and translation rate. While 3'-UTR isoforms have been globally quantified in limited cell types using bulk measurements, their differential usage among cell types during mammalian development remains poorly characterized. In this study, we examine a dataset comprising ~2 million nuclei spanning E9.5-E13.5 of mouse embryonic development to quantify transcriptome-wide changes in alternative polyadenylation (APA). We observe a global lengthening of 3' UTRs across embryonic stages in all cell types, although we detect shorter 3' UTRs in hematopoietic lineages and longer 3' UTRs in neuronal cell types within each stage. An analysis of RNA-binding protein (RBP) dynamics identifies ELAV-like family members, which are concomitantly induced in neuronal lineages and developmental stages experiencing 3'-UTR lengthening, as putative regulators of APA. By measuring 3'-UTR isoforms in an expansive single cell dataset, our work provides a transcriptome-wide and organism-wide map of the dynamic landscape of alternative polyadenylation during mammalian organogenesis.
Project description:The length of untranslated regions at the 3' end of transcripts (3'UTRs) is regulated by alternate polyadenylation (APA). 3'UTRs contain regions that harbor binding motifs for regulatory molecules. However, the mechanisms that coordinate the 3'UTR length of specific groups of transcripts are not well-understood. We therefore developed a method, CSI-UTR, that models 3'UTR structure as tandem segments between functional alternative-polyadenylation sites (termed cleavage site intervals-CSIs). This approach facilitated (1) profiling of 3'UTR isoform expression changes and (2) statistical enrichment of putative regulatory motifs. CSI-UTR analysis is UTR-annotation independent and can interrogate legacy data generated from standard RNA-Seq libraries. CSI-UTR identified a set of CSIs in human and rodent transcriptomes. Analysis of RNA-Seq datasets from neural tissue identified differential expression events within 3'UTRs not detected by standard gene-based differential expression analyses. Further, in many instances 3'UTR and CDS from the same gene were regulated differently. This modulation of motifs for RNA-interacting molecules with potential condition-dependent and tissue-specific RNA binding partners near the polyA signal and CSI junction may play a mechanistic role in the specificity of alternative polyadenylation. Source code, CSI BED files and example datasets are available at: https://github.com/UofLBioinformatics/CSI-UTR.
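The idea of modeling a 3'UTR as tandem segments bounded by consecutive polyadenylation sites can be sketched as follows. This is a minimal illustration, not the CSI-UTR implementation: the function name, the plus-strand assumption, and the half-open coordinate convention are all assumptions made for the example.

```python
from typing import List, Tuple


def cleavage_site_intervals(utr_start: int, pas_positions: List[int]) -> List[Tuple[int, int]]:
    """Split a plus-strand 3' UTR into tandem half-open segments [start, end).

    The first segment runs from the UTR start to the most proximal poly(A)
    site; each later segment runs from one site to the next. Sites at or
    upstream of utr_start are ignored, and duplicate sites are merged.
    """
    sites = sorted({p for p in pas_positions if p > utr_start})
    intervals = []
    left = utr_start
    for site in sites:
        intervals.append((left, site))
        left = site
    return intervals


# Toy coordinates (illustrative only): three poly(A) sites give three segments.
intervals = cleavage_site_intervals(1000, [1800, 1250, 2600])
```

Read coverage could then be quantified per segment, so that a change confined to a distal segment is visible even when total gene-level expression is unchanged.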
Project description:In eukaryotes, 3' untranslated regions (UTRs) play important roles in regulating posttranscriptional gene expression. The 3'UTR is defined by regulated cleavage/polyadenylation of the pre-mRNA. The advent of next-generation sequencing technology has now enabled us to identify these events on a genome-wide scale. In this study, we used poly(A)-position profiling by sequencing (3P-Seq) to capture all poly(A) sites across the genome of the freshwater planarian, Schmidtea mediterranea, an ideal model system for exploring the process of regeneration and stem cell function. We identified the 3'UTRs for ~14,000 transcripts and thus improved the existing gene annotations. We found 97 transcripts, which are polyadenylated within an internal exon, resulting in the shrinking of the ORF and loss of a predicted protein domain. Around 40% of the transcripts in planaria were alternatively polyadenylated (ApA), resulting either in an altered 3'UTR or a change in coding sequence. We identified specific ApA transcript isoforms that were subjected to miRNA-mediated gene regulation using degradome sequencing. In this study, we also confirmed a tissue-specific expression pattern for alternate polyadenylated transcripts. The insights from this study highlight the potential role of ApA in regulating the gene expression essential for planarian regeneration.
Project description:Alternative polyadenylation (APA) in 3' untranslated regions (3' UTR) plays an important role in regulating transcript abundance, localization, and interaction with microRNAs. Length-variation of 3'UTRs by APA contributes to efficient proliferation of cancer cells. In this study, we investigated APA in single cancer cells and tumor microenvironment cells to understand the physiological implication of APA in different cell types. We analyzed APA patterns and the expression level of genes from the 515 single-cell RNA sequencing (scRNA-seq) dataset from 11 breast cancer patients. Although the overall 3'UTR length of individual genes was distributed equally in tumor and non-tumor cells, we found a differential pattern of polyadenylation in gene sets between tumor and non-tumor cells. In addition, we found a differential pattern of APA across tumor types using scRNA-seq data from 3 glioblastoma patients and 1 renal cell carcinoma patient. In detail, 1,176 gene sets and 53 genes showed the distinct pattern of 3'UTR shortening and over-expression as signatures for five cell types including B lymphocytes, T lymphocytes, myeloid cells, stromal cells, and breast cancer cells. Functional categories of gene sets for cellular proliferation demonstrated concordant regulation of APA and gene expression specific to cell types. The expression of APA genes in breast cancer was significantly correlated with the clinical outcome of earlier stage breast cancer patients. We identified cell type-specific APA in single cells, which allows the identification of cell types based on 3'UTR length variation in combination with gene expression. Specifically, an immune-specific APA signature in breast cancer could be utilized as a prognostic marker of early stage breast cancer.
Project description:Background: Lung cancer is the second most common cancer with an extremely poor overall survival rate. Post-transcriptional regulation of gene expression plays many important roles in human cancer, and one of the potential mechanisms underlying this is alternative mRNA maturation at its 3' untranslated regions (3'-UTRs). Methods: Cancer tissues and paired adjacent normal lung tissues from 26 patients diagnosed with non-small cell lung cancer (NSCLC) were analyzed by in vitro transcription-sequencing alternative polyadenylation sites (IVT-SAPAS). An average of 41,773,101 reads was obtained from each paired sample. A potential regulation of 3'UTR length by Cleavage Stimulation Factor Subunit 2 (CSTF2) was tested in H460 cells. Results: 1439 (10.26%) genes showed up-regulated expression and 1364 (9.72%) genes showed down-regulated expression in lung cancer tissue versus normal lung tissue, and shortened 3'UTRs were detected in cancer tissues collected from 96.2% (25/26) of patients, indicating that lung cancer tends to have shortened 3'UTRs of these identified genes. KEGG analysis showed that 1855 genes with shortened 3'UTRs were enriched in mTOR signaling, ubiquitin-mediated proteolysis and RNA degradation. Knocking down CSTF2 expression in H460 cells resulted in 3'UTR elongation of genes that were identified as having shortened 3'UTRs in cancer tissues. Conclusion: Alternative polyadenylation (APA) site-switching of 3'UTRs is prevalent in NSCLC, and CSTF2 may serve as an oncogene that regulates the 3'UTR length of cancer-related genes in NSCLC.