Project description:Experimental methods for discovering RNA-binding protein (RBP) binding sites on target RNAs have recently emerged that fuse RBPs to RNA-editing enzymes (such as APOBEC1 or ADAR) to “label” mRNA. However, off-target editing, genetic variants, and sequencing errors can produce false positives in data derived from such approaches, highlighting the need for a robust statistical approach to prioritizing confident binding sites.
Project description:High-resolution methods such as 4C and Capture-C (CapC) enable the study of chromatin loops such as those formed between promoters and enhancers or between CTCF/cohesin binding sites. An important step in 4C/CapC analyses is calling robust peaks in the data in order to identify chromatin loops. Here we present an R package for the analysis of 4C/CapC data. We generated 4C data for 10 viewpoints in 2 tissues in triplicate to test our methods. We developed a non-parametric peak caller based on rank-products. Sampling analysis shows that template quality, not read depth, is the most important determinant of success in 4C experiments. Peak calling on single experiments gives results similar to those from replicate experiments, but performing replicates significantly reduces false positive rates.
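To make the rank-product idea concrete, here is a minimal Python sketch (the package itself is written in R, and this is not its implementation). It assumes the signal has already been summarised into the same genomic windows in every replicate; the function names, the toy signal values, and the pooled permutation null are all illustrative assumptions. Windows are ranked within each replicate, the ranks are multiplied across replicates, and a small product marks a window that is consistently near the top.

```python
import bisect
import math
import random


def ranks_descending(values):
    """Rank 1 = largest value; ties are broken by position (a simplification)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def product_of_ranks(rank_lists):
    """Multiply the ranks of each window across replicates.

    The classical rank product is the k-th root of this value; the integer
    product gives the same ordering and avoids floating-point ties.
    """
    n = len(rank_lists[0])
    return [math.prod(r[i] for r in rank_lists) for i in range(n)]


def rank_products(replicates):
    return product_of_ranks([ranks_descending(rep) for rep in replicates])


def permutation_pvalues(replicates, n_perm=500, seed=0):
    """Empirical p-values against a null of independently shuffled ranks."""
    rng = random.Random(seed)
    observed = rank_products(replicates)
    n, k = len(observed), len(replicates)
    null = []
    for _ in range(n_perm):
        shuffled = [rng.sample(range(1, n + 1), n) for _ in range(k)]
        null.extend(product_of_ranks(shuffled))
    null.sort()
    # Fraction of null products at least as small as the observed one.
    return [(bisect.bisect_right(null, rp) + 1) / (len(null) + 1) for rp in observed]


# Hypothetical coverage in 8 windows around a viewpoint, 3 replicates.
signal = [
    [3, 5, 4, 40, 6, 2, 7, 5],
    [4, 4, 6, 35, 5, 3, 6, 9],
    [2, 6, 5, 50, 4, 3, 8, 4],
]
rp = rank_products(signal)
pvals = permutation_pvalues(signal)
best = min(range(len(rp)), key=lambda i: rp[i])
print(best, rp[best], pvals[best])
```

Because only ranks enter the statistic, the test makes no assumption about the distribution of read counts, which is what "non-parametric" refers to in the description above.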
Project description:Cleavage Under Targets and Release Using Nuclease (CUT&RUN) has rapidly gained prominence as an effective approach for mapping protein-DNA interactions, especially histone modifications, offering substantial improvements over conventional chromatin immunoprecipitation sequencing (ChIP-seq). However, the effectiveness of this technique is contingent upon accurate peak identification, necessitating the use of optimal peak calling methods tailored to the unique characteristics of CUT&RUN data. Here, we benchmark four prominent peak calling tools - MACS2, SEACR, GoPeaks, and LanceOtron - evaluating their performance in identifying peaks from CUT&RUN datasets. Our analysis utilizes in-house data of three histone marks (H3K4me3, H3K27ac, and H3K27me3) from mouse brain tissue, as well as samples from the 4DNucleome database. We systematically assess these tools based on parameters such as the number of peaks called, peak length distribution, signal enrichment, and reproducibility across biological replicates. Our findings reveal substantial variability in peak calling efficacy, with each method demonstrating distinct strengths in sensitivity, precision, and applicability depending on the histone mark in question. These insights provide a comprehensive evaluation that will assist in selecting the most suitable peak caller for high-confidence identification of regions of interest in CUT&RUN experiments, ultimately enhancing the study of chromatin dynamics and transcriptional regulation.
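One common way to quantify reproducibility across biological replicates is the base-pair Jaccard index between two peak sets. The sketch below is an illustration under that assumption, not the metric the benchmark necessarily used; the peak coordinates and function names are hypothetical, and peaks are taken to be half-open (chrom, start, end) intervals.

```python
from collections import defaultdict


def merge(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered_bp(intervals):
    return sum(end - start for start, end in intervals)


def intersect_bp(a, b):
    """Shared base pairs between two merged, sorted interval lists."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def by_chrom(peaks):
    grouped = defaultdict(list)
    for chrom, start, end in peaks:
        grouped[chrom].append((start, end))
    return {chrom: merge(ivs) for chrom, ivs in grouped.items()}


def jaccard(peaks_a, peaks_b):
    """Intersection over union, in base pairs, summed over chromosomes."""
    a, b = by_chrom(peaks_a), by_chrom(peaks_b)
    shared = sum(intersect_bp(a[c], b[c]) for c in a.keys() & b.keys())
    total = sum(covered_bp(v) for v in a.values()) + sum(covered_bp(v) for v in b.values())
    union = total - shared
    return shared / union if union else 0.0


# Hypothetical peak calls from two replicates.
rep1 = [("chr1", 100, 200), ("chr1", 300, 400)]
rep2 = [("chr1", 150, 250), ("chr2", 0, 50)]
score = jaccard(rep1, rep2)
print(score)
```

A value near 1 means the two replicates cover almost the same bases; broad marks such as H3K27me3 and narrow marks such as H3K4me3 can score quite differently for the same caller, so scores are best compared within a mark.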
Project description:Background: Fusion of RNA-binding proteins (RBPs) to RNA base-editing enzymes (such as APOBEC1 or ADAR) has emerged as a powerful tool for the discovery of RBP binding sites. However, current methods that analyze sequencing data from RNA base-editing experiments are vulnerable to false positives due to off-target editing, genetic variation, and sequencing errors. Results: We present FLagging Areas of RNA-editing Enrichment (FLARE), a Snakemake-based pipeline that builds on the outputs of the SAILOR edit site discovery tool to identify regions statistically enriched for RNA editing. FLARE can be configured to analyze any type of RNA editing, including C-to-U and A-to-I. We applied FLARE to C-to-U editing data from an RBFOX2-APOBEC1 STAMP experiment to show that our approach attains high specificity for detecting RBFOX2 binding sites. We also applied FLARE to detect regions of exogenously introduced as well as endogenous A-to-I editing. Conclusions: FLARE is a fast and flexible workflow that identifies significantly edited regions from RNA-seq data. The FLARE codebase is available at https://github.com/YeoLab/FLARE.
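The idea of a region being "statistically enriched for RNA editing" can be illustrated with a simple sketch. This is not FLARE's actual statistical model, which the description does not specify. It assumes a known background edit rate, tests each window with a one-sided binomial test, and then applies Benjamini-Hochberg correction; the window names, counts, and background rate are hypothetical.

```python
import math


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enriched_windows(windows, background_rate, alpha=0.05):
    """windows: list of (name, edited_reads, total_coverage) tuples."""
    pvals = [binom_sf(edited, cov, background_rate) for _, edited, cov in windows]
    qvals = benjamini_hochberg(pvals)
    return [name for (name, _, _), q in zip(windows, qvals) if q < alpha]


# Hypothetical C-to-U edit counts and coverage per window.
windows = [
    ("win_A", 12, 200),  # far more edits than a 1% background predicts
    ("win_B", 2, 200),   # about what the background predicts
    ("win_C", 9, 150),
]
hits = enriched_windows(windows, background_rate=0.01)
print(hits)
```

Testing windows rather than single sites is what lets isolated genetic variants or sequencing errors fall below significance, while clusters of edits near a true binding site stand out.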
Project description:We developed a Transcriptomic Analysis Pipeline (TAP) as a flexible workflow for comprehensive transcriptome analysis of any species with a reference genome. We tested TAP in a case study comparing polyA+ and rRNA-depletion RNA-seq library protocols in Drosophila melanogaster after exposure to different thermal stress temperatures. TAP provides a flexible and complete pipeline that enables researchers to extract more biologically relevant interpretations through integrated and interactive transcriptome analysis.