PulseTD: RNA life cycle dynamics analysis based on pulse model of 4sU-seq time course sequencing data.
ABSTRACT: The life cycle of intracellular RNA mainly involves transcriptional production, splicing maturation and degradation processes. Their dynamic changes are termed RNA life cycle dynamics (RLCD). Accurate and robust identification of RLCD remains challenging when the functional form of RLCD is unknown. Using the pulse model, we developed an R package named pulseTD to identify RLCD by integrating 4sU-seq and RNA-seq data; it provides flexible functions to capture continuous changes in RLCD rates. More importantly, it can also predict the trend of RNA transcription and expression changes at future time points. pulseTD shows better accuracy and robustness than some other methods, and it is available on the GitHub repository (https://github.com/bioWzz/pulseTD_0.2.0).
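As a rough illustration of the three rates the abstract refers to, the sketch below integrates a commonly used two-compartment kinetic model: premature RNA P is produced at transcription rate alpha and processed at rate gamma, and mature RNA M is degraded at rate beta. This is not pulseTD's implementation or API. The function name, the constant rates and the forward-Euler scheme are assumptions chosen for brevity, whereas pulseTD is described as capturing continuously changing rates.

```python
def simulate_rna_life_cycle(alpha, gamma, beta, t_end, dt=0.001, p0=0.0, m0=0.0):
    """Forward-Euler integration of a two-compartment RNA model.

        dP/dt = alpha - gamma * P        (transcription, processing)
        dM/dt = gamma * P - beta * M     (processing, degradation)

    All rates are constant here, which is an illustrative assumption.
    Returns the premature and mature RNA levels at time t_end.
    """
    steps = int(round(t_end / dt))
    p, m = p0, m0
    for _ in range(steps):
        dp = alpha - gamma * p
        dm = gamma * p - beta * m
        p += dt * dp
        m += dt * dm
    return p, m


# Hypothetical rates (arbitrary units per hour), integrated long enough
# for both species to approach steady state.
p_end, m_end = simulate_rna_life_cycle(alpha=10.0, gamma=2.0, beta=0.5, t_end=60.0)
```

With constant rates the levels approach alpha / gamma for premature RNA and alpha / beta for mature RNA, which gives a quick sanity check on any fitted rate set.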
Project description:High concentrations (> 100 µM) of the ribonucleoside analog 4-thiouridine (4sU) are widely used in methods for RNA analysis like photoactivatable-ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP) and nascent messenger (m)RNA labeling (4sU-tagging). Here, we show that 4sU-tagging at low concentrations (≤ 10 µM) can be used to measure production and processing of ribosomal (r)RNA. However, elevated concentrations of 4sU (> 50 µM), which are usually used for mRNA labeling experiments, inhibit production and processing of 47S rRNA. The inhibition of rRNA synthesis is accompanied by nucleoplasmic translocation of nucleolar nucleophosmin (NPM1), induction of the tumor suppressor p53, and inhibition of proliferation. We conclude that metabolic labeling of RNA by 4sU triggers a nucleolar stress response, which might influence the interpretation of results. Therefore, functional ribosome biogenesis, nucleolar integrity, and cell cycle should be addressed in 4sU labeling experiments.
Project description:The brain is a highly complex organ consisting of numerous types of cells with ample diversity at the epigenetic level to achieve distinct gene expression profiles. During neuronal cell specification, transcription factors (TFs) form regulatory modules with chromatin remodeling proteins to initiate the cascade of epigenetic changes. Currently, little is known about brain epigenetic regulatory modules and how they regulate gene expression in a cell-type specific manner. To infer TFs involved in neuronal specification, we applied a recursive motif search approach on the differentially methylated regions identified from single-cell methylomes. The epigenetic transcription regulatory modules (ETRMs), including EGR1 and MEF2C, were predicted and the co-expression of TFs in ETRMs was examined with RNA-seq data from single or sorted brain cells using a conditional probability matrix. Lastly, computational predictions were validated with EGR1 ChIP-seq data. In addition, methylome and RNA-seq data generated from Egr1 knockout mice supported the essential role of EGR1 in brain epigenome programming, in particular for excitatory neurons. In summary, we demonstrated that brain single cell methylome and RNA-seq data can be integrated to gain a better understanding of how ETRMs control cell specification. The analytical pipeline implemented in this study is freely accessible in the GitHub repository (https://github.com/Gavin-Yinld/brain_TF).
Project description:RNA synthesis and decay rates determine the steady-state levels of cellular RNAs. Metabolic tagging of newly transcribed RNA by 4-thiouridine (4sU) can reveal the relative contributions of RNA synthesis and decay rates. The kinetics of RNA processing, however, had so far remained unresolved. Here, we show that ultrashort 4sU-tagging not only provides snapshot pictures of eukaryotic gene expression but, when combined with progressive 4sU-tagging and RNA-seq, reveals global RNA processing kinetics at nucleotide resolution. Using this method, we identified classes of rapidly and slowly spliced/degraded introns. Interestingly, each class of splicing kinetics was characterized by a distinct association with intron length, gene length, and splice site strength. For a large group of introns, we also observed long lasting retention in the primary transcript, but efficient secondary splicing or degradation at later time points. Finally, we show that processing of most, but not all small nucleolar (sno)RNA-containing introns is remarkably inefficient with the majority of introns being spliced and degraded rather than processed into mature snoRNAs. In summary, our study yields unparalleled insights into the kinetics of RNA processing and provides the tools to study molecular mechanisms of RNA processing and their contribution to the regulation of gene expression.
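To make the relationship between synthesis, decay and steady-state level concrete, here is a minimal sketch under simplifying assumptions: steady state, first-order decay, and no correction for cell growth or labeling bias. Under those assumptions the newly transcribed fraction after labeling for time t is 1 - exp(-k t), so the decay rate k can be recovered from a new/total ratio. The function names are hypothetical and this is not the analysis code of the study.

```python
import math


def decay_rate_from_new_to_total(ratio, labeling_time):
    """Decay rate k from the newly transcribed / total RNA ratio.

    Assumes steady state and first-order decay, so that
    ratio = 1 - exp(-k * labeling_time).
    """
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    if labeling_time <= 0.0:
        raise ValueError("labeling_time must be positive")
    return -math.log(1.0 - ratio) / labeling_time


def half_life_from_rate(decay_rate):
    """Half-life corresponding to a first-order decay rate."""
    return math.log(2.0) / decay_rate


def synthesis_rate(total_level, decay_rate):
    """At steady state, synthesis balances decay: s = k * total."""
    return total_level * decay_rate


# Hypothetical gene: half of the RNA is newly transcribed after 1 h of labeling.
k = decay_rate_from_new_to_total(ratio=0.5, labeling_time=1.0)
t_half = half_life_from_rate(k)           # 1 h by construction
s = synthesis_rate(total_level=100.0, decay_rate=k)
```

The same steady-state level can result from fast synthesis with fast decay or slow synthesis with slow decay, which is why the labeled fraction is needed to separate the two.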
Project description:Erythropoietin (EPO) acts through the dimeric erythropoietin receptor to stimulate proliferation, survival, differentiation and enucleation of erythroid progenitor cells. We undertook two complementary approaches to find EPO-dependent pSTAT5 target genes in murine erythroid cells: RNA-seq of newly transcribed (4sU-labelled) RNA, and ChIP-seq for pSTAT5 30 minutes after EPO stimulation. We found 302 pSTAT5-occupied sites: ~15% of these reside in promoters while the rest reside within intronic enhancers or intergenic regions, some >100kb from the nearest TSS. The majority of pSTAT5 peaks contain a central palindromic GAS element, TTCYXRGAA. There was significant enrichment for GATA motifs and CACCC-box motifs within the neighbourhood of pSTAT5-bound peaks, and GATA1 and/or KLF1 co-occupancy at many sites. Using 4sU-RNA-seq we determined the EPO-induced transcriptome and validated differentially expressed genes using dynamic CAGE data and qRT-PCR. We identified known direct pSTAT5 target genes such as Bcl2l1, Pim1 and Cish, and many new targets likely to be involved in driving erythroid cell differentiation including those involved in mRNA splicing (Rbm25), epigenetic regulation (Suv420h2), and EpoR turnover (Clint1/EpsinR). Some of these new EpoR-JAK2-pSTAT5 target genes could be used as biomarkers for monitoring disease activity in polycythaemia vera, and for monitoring responses to JAK inhibitors.
Project description:The extensive generation of RNA sequencing (RNA-seq) data in the last decade has resulted in a myriad of specialized software for its analysis. Each software module typically targets a specific step within the analysis pipeline, making it necessary to join several of them to get a single cohesive workflow. Multiple software programs automating this procedure have been proposed, but often lack modularity, transparency or flexibility. We present ARMOR, which performs an end-to-end RNA-seq data analysis, from raw read files, via quality checks, alignment and quantification, to differential expression testing, geneset analysis and browser-based exploration of the data. ARMOR is implemented using the Snakemake workflow management system and leverages conda environments; Bioconductor objects are generated to facilitate downstream analysis, ensuring seamless integration with many R packages. The workflow is easily implemented by cloning the GitHub repository, replacing the supplied input and reference files and editing a configuration file. Although we have selected the tools currently included in ARMOR, the setup is modular and alternative tools can be easily integrated.
Project description:Biochemical methods are available for enriching 5' ends of RNAs in prokaryotes, which are employed in the differential RNA-seq (dRNA-seq) and the more recent Cappable-seq protocols. Computational methods are needed to locate RNA 5' ends from these data by statistical analysis of the enrichment. Although statistical-based analysis methods have been developed for dRNA-seq, they may not be suitable for Cappable-seq data. The more efficient enrichment method employed in Cappable-seq compared with dRNA-seq could affect data distribution and thus algorithm performance. We present Transformation of Nucleotide Enrichment Ratios (ToNER), a tool for statistical modeling of enrichment from RNA-seq data obtained from enriched and unenriched libraries. The tool calculates nucleotide enrichment scores and determines the global transformation for fitting to the normal distribution using the Box-Cox procedure. From the transformed distribution, sites of significant enrichment are identified. To increase power of detection, meta-analysis across experimental replicates is offered. We tested the tool on Cappable-seq and dRNA-seq data for identifying Escherichia coli transcript 5' ends and compared the results with those from the TSSAR tool, which is designed for analyzing dRNA-seq data. When combining results across Cappable-seq replicates, ToNER detects more known transcript 5' ends than TSSAR. In general, the transcript 5' ends detected by ToNER but not TSSAR occur in regions which cannot be locally modeled by TSSAR. ToNER uses a simple yet robust statistical modeling approach, which can be used for detecting RNA 5' ends from Cappable-seq data, in particular when combining information from experimental replicates. The ToNER tool could potentially be applied for analyzing other RNA-seq datasets in which enrichment for other structural features of RNA is employed.
The program is freely available for download at the ToNER webpage (http://www4a.biotec.or.th/GI/tools/toner) and the GitHub repository (https://github.com/PavitaKae/ToNER).
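The enrichment-scoring idea described above can be sketched roughly as follows: compute a per-site enrichment ratio, pick a Box-Cox exponent by maximizing the profile log-likelihood over a grid, and score each site against a normal distribution fitted to the transformed ratios. This is an illustrative sketch only, not ToNER's code. The pseudocount, the lambda grid, the function names and the one-sided test are all assumptions, and the replicate meta-analysis step is not covered.

```python
import math
from statistics import NormalDist, fmean, pstdev


def box_cox(x, lam):
    """Box-Cox transform of a positive value."""
    return math.log(x) if abs(lam) < 1e-12 else (x ** lam - 1.0) / lam


def box_cox_loglik(values, lam):
    """Profile log-likelihood of lambda, up to an additive constant."""
    n = len(values)
    transformed = [box_cox(v, lam) for v in values]
    variance = pstdev(transformed) ** 2
    return -0.5 * n * math.log(variance) + (lam - 1.0) * sum(math.log(v) for v in values)


def fit_lambda(values, grid=None):
    """Grid search for lambda (the grid from -2 to 2 is an assumption)."""
    if grid is None:
        grid = [i / 20.0 for i in range(-40, 41)]
    return max(grid, key=lambda lam: box_cox_loglik(values, lam))


def enrichment_p_values(enriched, unenriched, pseudocount=1.0):
    """One-sided p-values for enrichment at each site.

    The pseudocount keeps ratios positive and finite; its value is a
    hypothetical choice, not taken from the ToNER paper.
    """
    ratios = [(e + pseudocount) / (u + pseudocount)
              for e, u in zip(enriched, unenriched)]
    lam = fit_lambda(ratios)
    transformed = [box_cox(r, lam) for r in ratios]
    null = NormalDist(fmean(transformed), pstdev(transformed))
    return [1.0 - null.cdf(v) for v in transformed]


# Synthetic read counts per site; site index 4 is strongly enriched.
enriched_counts = [10, 12, 9, 11, 500, 10, 8, 13]
unenriched_counts = [10, 11, 10, 12, 10, 9, 9, 12]
p_values = enrichment_p_values(enriched_counts, unenriched_counts)
```

Because the Box-Cox transform is monotonic, the ranking of sites is unchanged; the transform only affects how well the normal model describes the background and therefore how the p-values are calibrated.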
Project description:Summary: G-OnRamp provides a user-friendly, web-based platform for collaborative, end-to-end annotation of eukaryotic genomes using UCSC Assembly Hubs and JBrowse/Apollo genome browsers with evidence tracks derived from sequence alignments, ab initio gene predictors, RNA-Seq data and repeat finders. G-OnRamp can be used to visualize large genomics datasets and to perform collaborative genome annotation projects in both research and educational settings. Availability and implementation: The virtual machine images and tutorials are available on the G-OnRamp web site (http://g-onramp.org/deployments). The source code is available under an Academic Free License version 3.0 through the goeckslab GitHub repository (https://github.com/goeckslab). Supplementary information: Supplementary data are available at Bioinformatics online.
Project description:The mRNA m6A reader YTHDF2 is overexpressed in a broad spectrum of human acute myeloid leukemias (AML). To study the role of YTHDF2 on mRNA decay rates in leukemia, c-Kit+ cells from foetal livers of Ythdf2fl/fl; Vav-iCre (Ythdf2CKO) and Ythdf2fl/fl (Ythdf2CTL) 14.5 dpc embryos were transduced with Meis1 and Hoxa9 oncogenes and serially re-plated to generate pre-leukemic cells. Pre-leukemic cells were labelled in medium containing 4sU for 12 hours, after which the medium was replaced with 4sU-free medium (time 0). Cells were collected immediately after the medium change and at 1, 3 and 9 hours for library generation. RNA from Ythdf2CKO (n=3 biological replicates) and Ythdf2CTL (n=3 biological replicates) pre-leukemic cells was used for SLAM-seq library generation.
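A pulse-chase design like this one (samples at 0, 1, 3 and 9 hours after removing 4sU) is typically analyzed by fitting an exponential decay to the labeled signal of each transcript. The sketch below does this with a log-linear least-squares fit. It is a minimal illustration on synthetic values, assuming first-order decay and already normalized labeled fractions; the function name and the data are hypothetical and this is not the analysis used for this dataset.

```python
import math


def fit_decay_rate(times_h, labeled_signal):
    """Least-squares slope of log(signal) versus time, returned as a decay rate.

    Assumes first-order decay, signal(t) = signal(0) * exp(-k * t),
    and strictly positive signal values.
    """
    logs = [math.log(v) for v in labeled_signal]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    return -sxy / sxx


# Chase time points from the description; the signal itself is synthetic,
# generated with a hypothetical decay rate of 0.2 per hour.
chase_times = [0.0, 1.0, 3.0, 9.0]
synthetic_signal = [math.exp(-0.2 * t) for t in chase_times]

decay_rate = fit_decay_rate(chase_times, synthetic_signal)
half_life_h = math.log(2.0) / decay_rate
```

Comparing the fitted half-lives between the Ythdf2CKO and Ythdf2CTL replicates would then indicate which transcripts are stabilized when YTHDF2 is lost.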
Project description:RNA-seq analysis is becoming a standard method for global gene expression profiling. However, open and standard pipelines to perform RNA-seq analysis by non-experts remain challenging due to the large size of the raw data files and the hardware requirements for running the alignment step. Here we introduce a reproducible open source RNA-seq pipeline delivered as an IPython notebook and a Docker image. The pipeline uses state-of-the-art tools and can run on various platforms with minimal configuration overhead. The pipeline enables the extraction of knowledge from typical RNA-seq studies by generating interactive principal component analysis (PCA) and hierarchical clustering (HC) plots, performing enrichment analyses against over 90 gene set libraries, and obtaining lists of small molecules that are predicted to either mimic or reverse the observed changes in mRNA expression. We apply the pipeline to a recently published RNA-seq dataset collected from human neuronal progenitors infected with the Zika virus (ZIKV). In addition to confirming the presence of cell cycle genes among the genes that are downregulated by ZIKV, our analysis uncovers significant overlap with upregulated genes that when knocked out in mice induce defects in brain morphology. This result potentially points to the molecular processes associated with the microcephaly phenotype observed in newborns from pregnant mothers infected with the virus. In addition, our analysis predicts small molecules that can either mimic or reverse the expression changes induced by ZIKV. The IPython notebook and Docker image are freely available at: http://nbviewer.jupyter.org/github/maayanlab/Zika-RNAseq-Pipeline/blob/master/Zika.ipynb and https://hub.docker.com/r/maayanlab/zika/.
Project description:TGF-b1 stimulation induces an epithelial dedifferentiation process, throughout which epithelial cell sheets disintegrate and gradually switch into fibroblastic-appearing cells (EMT-like transition). The purpose of these profiles was to identify differentially expressed genes that are regulated transcriptionally. Standard microarray-based gene expression profiles measure steady-state RNA but do not provide insight into underlying regulatory principles. NIAC-NTR-based gene expression profiling (Kenzelmann et al., PNAS, 2007) essentially enables the dissection of transcriptionally versus non-transcriptionally regulated genes within the respective analysed time frames. Briefly, NIAC-NTR relies on incorporation of 4sU (thio-uridine) into nascent RNA, which can subsequently be specifically isolated by custom-made columns. Total and enriched (4sU-labeled) RNA are then further processed for microarray gene expression profiling by standard procedures. This dataset complements previously released data of NIAC-NTR-based gene expression profiling of cells treated with TGF-b1 and 4sU for 2hrs [GSE23833]. The present data and files represent the outcome of NIAC-NTR-based gene expression profiling of cells treated with TGF-b1 for 24hrs and incubation with 4sU for the last 2hrs (22hrs-24hrs of TGF-b1 stimulation). NIAC-NTR: Non Invasive Application and Capture of Newly Transcribed RNA. NMuMG cells were seeded 24hrs prior to treatment. Cells were stimulated with 5ng/ml TGF-b1 for a total of 24hrs. After 22hrs of stimulation, 200 µM 4sU (thio-uridine) was added to the cultures, which were incubated for another 2hrs. Total RNA was extracted using RNeasy Mini Kits (Qiagen). 4sU-labeled RNA was further extracted from total RNA using mercury-based custom-made columns. The experiments were performed as independent biological triplicates. Details about isolation of 4sU-labeled RNA can be found in Kenzelmann et al. PNAS, 2007.