Optimization of PAR-CLIP for transcriptome-wide identification of binding sites of RNA-binding proteins.
ABSTRACT: Photoactivatable-Ribonucleoside-Enhanced Crosslinking and Immunoprecipitation (PAR-CLIP) in combination with next-generation sequencing is a powerful method for identifying endogenous targets of RNA-binding proteins (RBPs). Depending on the characteristics of each RBP, key steps in the PAR-CLIP procedure must be optimized. Here we present a comprehensive step-by-step PAR-CLIP protocol with detailed explanations of the critical steps. Furthermore, we report the application of a new PAR-CLIP data analysis pipeline to three distinct RBPs targeting different annotation categories of cellular RNAs.
Project description:miRNAs are short (20-23 nt) RNAs that are loaded into proteins of the Argonaute (AGO) family and guide them to partially complementary target sites on mRNAs, resulting in mRNA destabilization and/or translational repression. It is estimated that about 60% of the mammalian genes are potentially regulated by miRNAs, and therefore methods for experimental miRNA target determination have become valuable tools for the characterization of posttranscriptional gene regulation. Here we present a step-by-step protocol and guidelines for the computational analysis for the large-scale identification of miRNA target sites in cultured cells by photoactivatable ribonucleoside enhanced crosslinking and immunoprecipitation (PAR-CLIP) of AGO proteins.
Project description:BACKGROUND:Next-generation sequencing technologies have profoundly impacted biology over recent years. Experimental protocols, such as photoactivatable ribonucleoside-enhanced cross-linking and immunoprecipitation (PAR-CLIP), which identifies protein-RNA interactions on a genome-wide scale, commonly employ deep sequencing. With PAR-CLIP, the incorporation of photoactivatable nucleosides into nascent transcripts leads to high rates of specific nucleotide conversions during reverse transcription. So far, the specific properties of PAR-CLIP-derived sequencing reads have not been assessed in depth. METHODS:Here we compared PAR-CLIP sequencing reads to regular transcriptome sequencing reads (RNA-Seq) to identify distinctive properties that are relevant for reference-based read alignment of PAR-CLIP datasets. We developed a set of freely available tools for PAR-CLIP data analysis, called the PAR-CLIP analyzer suite (PARA-suite). The PARA-suite includes error model inference, PAR-CLIP read simulation based on PAR-CLIP specific properties, a full read alignment pipeline with a modified Burrows-Wheeler Aligner algorithm and CLIP read clustering for binding site detection. RESULTS:We show that differences in the error profiles of PAR-CLIP reads relative to regular transcriptome sequencing reads (RNA-Seq) make distinct processing advantageous. We examine the alignment accuracy of commonly applied read aligners on 10 simulated PAR-CLIP datasets using different parameter settings and identify the most accurate setup among those read aligners. We demonstrate the performance of the PARA-suite in conjunction with different binding site detection algorithms on several real PAR-CLIP and HITS-CLIP datasets. Our processing pipeline improved both alignment and binding site detection accuracy.
AVAILABILITY:The PARA-suite toolkit and the PARA-suite aligner are available at https://github.com/akloetgen/PARA-suite and https://github.com/akloetgen/PARA-suite_aligner, respectively, under the GNU GPLv3 license.
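The idea of inferring an error profile from aligned reads can be illustrated with a minimal sketch. It is not the PARA-suite implementation: the function name `substitution_profile`, the input format (gap-free reference/read pairs of equal length) and the toy sequences are assumptions made for illustration only. The sketch counts how often each reference base is read as a different base, which is where the elevated T-to-C rate of PAR-CLIP reads would show up.

```python
from collections import Counter


def substitution_profile(pairs):
    """Estimate per-base substitution rates from gap-free (reference, read) pairs.

    Hypothetical helper: returns {(ref_base, read_base): rate}, where rate is
    the number of observed substitutions divided by the number of reference
    bases of that type.
    """
    substitutions = Counter()
    ref_totals = Counter()
    for ref, read in pairs:
        if len(ref) != len(read):
            raise ValueError("aligned sequences must have equal length")
        for ref_base, read_base in zip(ref.upper(), read.upper()):
            ref_totals[ref_base] += 1
            if ref_base != read_base:
                substitutions[(ref_base, read_base)] += 1
    return {sub: n / ref_totals[sub[0]] for sub, n in substitutions.items()}


# Toy alignments (invented for illustration): two reads carry one T-to-C change.
pairs = [
    ("ACGTTTGA", "ACGTCTGA"),
    ("TTGACCAT", "TCGACCAT"),
    ("GGATTACA", "GGATTACA"),
]
profile = substitution_profile(pairs)
print(profile)
```

In this toy input, 2 of the 8 reference T positions are read as C, so the only entry in the profile is the T-to-C rate. A real error model would also need to handle insertions, deletions and base qualities, which this sketch ignores.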
Project description:Post-transcriptional gene expression is robustly regulated by RNA-binding proteins (RBPs). Here we describe the collection of RNAs regulated by AUF1 (AU-binding factor 1), an RBP linked to cancer, inflammation and aging. Photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP) analysis reveals that AUF1 primarily recognizes U-/GU-rich sequences in mRNAs and noncoding RNAs and influences target transcript fate in three main directions. First, AUF1 lowers the steady-state levels of numerous target RNAs, including long noncoding RNA NEAT1, in turn affecting the organization of nuclear paraspeckles. Second, AUF1 does not change the abundance of many target RNAs, but ribosome profiling reveals that AUF1 promotes the translation of numerous mRNAs in this group. Third, AUF1 unexpectedly enhances the steady-state levels of several target mRNAs encoding DNA-maintenance proteins. Through its actions on target RNAs, AUF1 preserves genomic integrity, in agreement with the AUF1-elicited prevention of premature cellular senescence.
Project description:The photoactivatable ribonucleoside enhanced cross-linking immunoprecipitation (PAR-CLIP) has been increasingly used for the global mapping of RNA-protein interaction sites. There are two key features of the PAR-CLIP experiments: The sequence read tags are likely to form an enriched peak around each RNA-protein interaction site; and the cross-linking procedure is likely to introduce a specific mutation in each sequence read tag at the interaction site. Several ad hoc methods have been developed to identify the RNA-protein interaction sites using either sequence read counts or mutation counts alone; however, rigorous statistical methods for analyzing PAR-CLIP are still lacking. In this article, we propose an integrative model to establish a joint distribution of observed read and mutation counts. To pinpoint the interaction sites at single base-pair resolution, we developed a novel modeling approach that adopts non-homogeneous hidden Markov models to incorporate the nucleotide sequence at each genomic location. Both simulation studies and data application showed that our method outperforms the ad hoc methods, and provides reliable inferences for the RNA-protein binding sites from PAR-CLIP data.
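To make the idea of jointly modelling read and mutation counts concrete, the following is a deliberately simplified sketch, not the model proposed in the abstract. It assumes a two-state homogeneous HMM (background versus bound), Poisson-distributed read counts and binomially distributed mutation counts given the read count; the abstract's non-homogeneous, sequence-aware transitions are left out. All parameter values (`LAMBDA`, `P_MUT`, `STAY`) and the toy counts are hypothetical.

```python
import math

# Hypothetical parameters. State 0 = background, state 1 = bound.
LAMBDA = (2.0, 20.0)   # expected read count per position
P_MUT = (0.01, 0.30)   # probability that a read carries the mutation
STAY = 0.9             # probability of remaining in the same state


def log_poisson(k, lam):
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def log_binomial(m, k, p):
    log_choose = math.lgamma(k + 1) - math.lgamma(m + 1) - math.lgamma(k - m + 1)
    return log_choose + m * math.log(p) + (k - m) * math.log(1 - p)


def log_emission(state, reads, muts):
    # Joint log-likelihood of (read count, mutation count) in a given state.
    return log_poisson(reads, LAMBDA[state]) + log_binomial(muts, reads, P_MUT[state])


def viterbi(read_counts, mut_counts):
    """Return the most likely state path (0/1 per position)."""
    log_trans = [[math.log(STAY), math.log(1 - STAY)],
                 [math.log(1 - STAY), math.log(STAY)]]
    score = [math.log(0.5) + log_emission(s, read_counts[0], mut_counts[0])
             for s in (0, 1)]
    backpointers = []
    for k, m in zip(read_counts[1:], mut_counts[1:]):
        new_score, pointers = [], []
        for s in (0, 1):
            prev = max((0, 1), key=lambda r: score[r] + log_trans[r][s])
            pointers.append(prev)
            new_score.append(score[prev] + log_trans[prev][s] + log_emission(s, k, m))
        score = new_score
        backpointers.append(pointers)
    state = max((0, 1), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy counts (invented): a peak with many mutated reads at positions 3-6.
reads = [1, 2, 3, 18, 22, 25, 19, 2, 1, 0]
muts = [0, 0, 0, 5, 8, 7, 6, 0, 0, 0]
path = viterbi(reads, muts)
print(path)
```

With these toy values the decoded path marks positions 3 to 6 as bound. Making the transition probabilities depend on the nucleotide at each position, as the abstract describes, would replace the fixed `log_trans` table with a position-specific one.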
Project description:PAR-CLIP (photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation) facilitates the identification and mapping of protein/RNA interactions. So far, it has been limited to select cell-lines as it requires efficient 4SU uptake. To increase transcriptome complexity and thus identify additional RNA-protein interaction sites we fused HEK 293 T-Rex cells (HEK293-Y) that express the RNA binding protein YBX1 with PC12 cells expressing eGFP (PC12-eGFP). The resulting hybrids enable PAR-CLIP on a neuronally expanded transcriptome (Fusion-CLIP) and serve as a proof of principle. The fusion cells express both parental marker genes YBX1 and eGFP and the expanded transcriptome contains human and rat transcripts. PAR-CLIP of fused cells versus the parental HEK293-Y identified 768 novel RNA targets of YBX1. We were able to trace the origin of the majority of the short PAR-CLIP reads as they differentially mapped to the human and rat genome. Furthermore, Fusion-CLIP expanded the CAUC RNA binding motif of YBX1 to UCUUUNNCAUC. The fusion of HEK293-Y and PC12-eGFP cells resulted in cells with a diverse genome expressing human and rat transcripts that enabled the identification of novel YBX1 substrates. The technique allows the expansion of the HEK 293 transcriptome and makes PAR-CLIP available to fusion cells of diverse origin.
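Tracing the origin of reads by differential mapping could look roughly like the sketch below. The abstract does not state the decision rule used, so the rule here (fewest mismatches wins, ties are ambiguous) and the function name `assign_origin` are assumptions for illustration. Inputs are the mismatch counts of a read's best alignment to each genome, or `None` if it did not align.

```python
def assign_origin(human_mismatches, rat_mismatches):
    """Assign a read to a species from its best-alignment mismatch counts.

    Hypothetical rule: a read belongs to the genome where it aligns with
    fewer mismatches; equal counts are reported as ambiguous.
    """
    if human_mismatches is None and rat_mismatches is None:
        return "unmapped"
    if rat_mismatches is None:
        return "human"
    if human_mismatches is None:
        return "rat"
    if human_mismatches < rat_mismatches:
        return "human"
    if rat_mismatches < human_mismatches:
        return "rat"
    return "ambiguous"


# Toy records (invented): read id -> (mismatches vs human, mismatches vs rat)
alignments = {
    "read1": (0, 3),
    "read2": (None, 1),
    "read3": (1, 1),
    "read4": (None, None),
}
origins = {name: assign_origin(h, r) for name, (h, r) in alignments.items()}
print(origins)
```

For short PAR-CLIP reads, T-to-C conversions count as mismatches against both genomes, so a practical rule might discount them before comparing; that refinement is not shown here.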
Project description:BACKGROUND: MicroRNAs (miRNAs) play a critical role in down-regulating gene expression. By coupling with Argonaute family proteins, miRNAs bind to target sites on mRNAs and induce translational repression. A large number of miRNA-target interactions (MTIs) have been identified by crosslinking and immunoprecipitation (CLIP) and photoactivatable-ribonucleoside-enhanced CLIP (PAR-CLIP) combined with next-generation sequencing (NGS). PAR-CLIP shows high efficiency of RNA co-immunoprecipitation, but it also leads to T to C conversion in miRNA-RNA-protein crosslinking regions. This artificial error reduces the mappability of reads. However, a specific tool for analyzing CLIP and PAR-CLIP data that takes T to C conversion into account is still needed. RESULTS: We herein propose the first CLIP and PAR-CLIP sequencing analysis platform specifically for miRNA target analysis, namely miRTarCLIP. Starting from raw reads, it automatically removes adaptor sequences, filters low quality reads, reverts C to T, aligns reads to 3'UTRs, scans for read clusters, identifies high confidence miRNA target sites, and provides annotations from external databases. With multi-threading techniques and our novel C to T reversion procedure, miRTarCLIP greatly reduces the running time compared with conventional approaches. In addition, miRTarCLIP provides a web-based interface for browsing and searching targets of miRNAs of interest. To demonstrate the functionality of miRTarCLIP, we applied it to two publicly available CLIP and PAR-CLIP sequencing datasets. miRTarCLIP not only produces results comparable to those of other existing tools at a much faster speed, but also reveals interesting features among these putative target sites.
Specifically, we used miRTarCLIP to show that T to C conversion within positions 1-7 and within positions 8-14 of miRNA target sites differs significantly (p value = 0.02), and that the difference is more significant when focusing only on sites targeted by the top 102 highly expressed miRNAs (p value = 0.01). These results are consistent with previous findings and further suggest that combining miRNA expression and PAR-CLIP data can improve the accuracy of miRNA target prediction. CONCLUSION: To sum up, we devised a systematic approach for mining miRNA-target sites from CLIP-seq and PAR-CLIP sequencing data, and integrated the workflow with a graphical web-based browser, which provides a user-friendly interface and detailed annotations of MTIs. We also showed through real-life examples that miRTarCLIP is a powerful tool for understanding miRNAs. Our integrated tool is freely accessible online at http://miRTarCLIP.mbc.nctu.edu.tw.
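Why T to C conversions hurt mappability, and how treating C and T as equivalent can recover a read's position, can be sketched as follows. This is not miRTarCLIP's reversion procedure, whose details the abstract does not give; it is one simple conversion-tolerant matching strategy, and the function names and toy sequences are assumptions for illustration.

```python
def collapse(seq):
    """Project a sequence onto a three-letter alphabet by writing every C as T."""
    return seq.upper().replace("C", "T")


def locate_with_conversions(read, reference):
    """Find the first placement of `read` on `reference` ignoring C/T differences.

    Returns (start, n_t_to_c), where n_t_to_c counts positions with T in the
    reference and C in the read, or None if the read cannot be placed.
    Caveat of this sketch: collapsing also tolerates the opposite change
    (reference C read as T), which a real tool would have to filter out.
    """
    start = collapse(reference).find(collapse(read))
    if start < 0:
        return None
    window = reference.upper()[start:start + len(read)]
    conversions = sum(
        1 for ref_base, read_base in zip(window, read.upper())
        if ref_base == "T" and read_base == "C"
    )
    return start, conversions


# Toy example (invented): the read carries one T-to-C conversion.
reference = "GGATTTGCATTAGGA"
read = "TCTGCAT"
exact = reference.find(read)
hit = locate_with_conversions(read, reference)
print(exact, hit)
```

The exact search fails because of the single converted base, while the collapsed search places the read at offset 3 and reports one T-to-C conversion, which can then be used as crosslinking evidence.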
Project description:Crosslinking and immunoprecipitation (CLIP) protocols have made it possible to identify transcriptome-wide RNA-protein interaction sites. In particular, PAR-CLIP utilizes a photoactivatable nucleoside for more efficient crosslinking. We present an approach, centered on the novel PARalyzer tool, for mapping high-confidence sites from PAR-CLIP deep-sequencing data. We show that PARalyzer delineates sites with a high signal-to-noise ratio. Motif finding identifies the sequence preferences of RNA-binding proteins, as well as seed-matches for highly expressed microRNAs when profiling Argonaute proteins. Our study describes tailored analytical methods and provides guidelines for future efforts to utilize high-throughput sequencing in RNA biology. PARalyzer is available at http://www.genome.duke.edu/labs/ohler/research/PARalyzer/.
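Searching binding sites for microRNA seed-matches can be sketched in a few lines. This is an illustration rather than part of PARalyzer: it assumes the common 7mer definition in which the site contains the reverse complement of miRNA nucleotides 2-8, and the function names and the toy site sequence are invented. The miRNA used is let-7a-5p.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_match(mirna, start=2, end=8):
    """Return the target-site motif complementary to miRNA positions start..end (1-based)."""
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(site, mirna):
    """Return 0-based start offsets of every seed-match of `mirna` within `site`."""
    site = site.upper().replace("T", "U")
    motif = seed_match(mirna)
    hits = []
    pos = site.find(motif)
    while pos != -1:
        hits.append(pos)
        pos = site.find(motif, pos + 1)
    return hits


let7a = "UGAGGUAGUAGGUUGUAUAGUU"          # let-7a-5p
site = "AAGCUACCUCAGGAUCUACCUCUU"         # toy binding site (invented)
motif = seed_match(let7a)
hits = find_seed_matches(site, let7a)
print(motif, hits)
```

For let-7a the motif is CUACCUC, and the toy site contains it at offsets 3 and 15. In practice one would scan every reported site against the seeds of the most highly expressed miRNAs and compare the hit rate with shuffled or background sequences.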
Project description:Human LIN28A and B are RNA-binding proteins (RBPs) conserved in animals with important roles during development and stem cell reprogramming. We used Photoactivatable-Ribonucleoside-Enhanced Crosslinking and Immunoprecipitation (PAR-CLIP) in HEK293 cells and identified a largely overlapping set of ~3,000 mRNAs at ~9,500 sites located in the 3’UTR and CDS. In vitro and in vivo, LIN28 preferentially bound single-stranded RNA containing a uridine-rich element and one or more flanking guanosines, and appeared to be able to disrupt base-pairing to access these elements when embedded in predicted secondary structure. In HEK293 cells, LIN28 protein binding mildly stabilized target mRNAs and increased protein abundance. The top targets were its own mRNAs and those of other RBPs and cell-cycle regulators. Alteration of LIN28 protein levels also negatively regulated the abundance of some, but not all let-7 miRNA family members, indicating sequence-specific binding of let-7 precursors to LIN28 proteins and competition with cytoplasmic miRNA biogenesis factors. LIN28 protein PAR-CLIP
Project description:CNBP is a eukaryote-conserved nucleic-acid binding protein required in mammals for embryonic development. It contains seven CCHC-type zinc-finger domains and was suggested to act as a nucleic acid chaperone, as well as a transcription factor. Here, we identify all CNBP isoforms as cytoplasmic messenger RNA (mRNA)-binding proteins. Using Photoactivatable Ribonucleoside Enhanced Cross-linking and Immunoprecipitation, we mapped its binding sites on RNA at nucleotide-level resolution on a genome-wide scale and find that CNBP interacted with 3961 mRNAs in human cell lines, preferentially at a G-rich motif close to the AUG start codon on mature mRNAs. Loss- and gain-of-function analyses coupled with system-wide RNA and protein quantification revealed that CNBP did not affect RNA abundance, but rather promoted translation of its targets. This is consistent with an RNA chaperone function of CNBP helping to resolve secondary structures, thus promoting translation. CNBP PAR-CLIP
Project description:BACKGROUND:Various microRNAs (miRNAs) are up- or downregulated in tumors. However, the repression of cognate miRNA targets responsible for the phenotypic effects of this dysregulation in patients remains largely unexplored. To define miRNA targets and associated pathways, together with their relationship to outcome in breast cancer, we integrated patient-paired miRNA-mRNA expression data with a set of validated miRNA targets and pathway inference. RESULTS:To generate a biochemically-validated set of miRNA-binding sites, we performed argonaute-2 photoactivatable-ribonucleoside-enhanced crosslinking and immunoprecipitation (AGO2-PAR-CLIP) in MCF7 cells. We then defined putative miRNA-target interactions using a computational model, which ranked and selected additional TargetScan-predicted interactions based on features of our AGO2-PAR-CLIP binding-site data. We subselected modeled interactions according to the abundance of their constituent miRNA and mRNA transcripts in tumors, and we took advantage of the variability of miRNA expression within molecular subtypes to detect miRNA repression. Interestingly, our data suggest that miRNA families control subtype-specific pathways; for example, miR-17, miR-19a, miR-25, and miR-200b show high miRNA regulatory activity in the triple-negative, basal-like subtype, whereas miR-22 and miR-24 do so in the HER2 subtype. An independent dataset validated our findings for miR-17 and miR-25, and showed a correlation between the expression levels of miR-182 targets and overall patient survival. Pathway analysis associated miR-17, miR-19a, and miR-200b with leukocyte transendothelial migration. CONCLUSIONS:We combined PAR-CLIP data with patient expression data to predict regulatory miRNAs, revealing potential therapeutic targets and prognostic markers in breast cancer.