A Small RNA Isolation and Sequencing Protocol and Its Application to Assay CRISPR RNA Biogenesis in Bacteria.
ABSTRACT: Next generation high-throughput sequencing has enabled sensitive and unambiguous analysis of RNA populations in cells. Here, we describe a method for isolation and strand-specific sequencing of small RNA pools from bacteria that can be multiplexed to accommodate multiple biological samples in a single experiment. Small RNAs are isolated by polyacrylamide gel electrophoresis and treated with T4 polynucleotide kinase. This allows for 3' adapter ligation to CRISPR RNAs, which don't have pre-existing 3'-OH ends. Pre-adenylated adapters are then ligated using T4 RNA ligase 1 in the absence of ATP and with a high concentration of polyethylene glycol (PEG). The 3' capture step enables precise determination of the 3' ends of diverse RNA molecules. Additionally, a random hexamer in the ligated adapter helps control for potential downstream amplification bias. Following reverse-transcription, the cDNA product is circularized and libraries are prepared by PCR. We show that the amplified library need not be visible by gel electrophoresis for efficient sequencing of the desired product. Using this method, we routinely prepare RNA sequencing libraries from minute amounts of purified small RNA. This protocol is tailored to assay for CRISPR RNA biogenesis in bacteria through sequencing of mature CRISPR RNAs, but can be used to sequence diverse classes of small RNAs. We also provide a fully worked example of our data processing pipeline, with instructions for running the provided scripts.
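The role of the random hexamer can be illustrated with a minimal sketch. It assumes, hypothetically, that the hexamer occupies the first six bases of each adapter-trimmed read; the real position depends on the adapter design and read orientation, which the abstract does not specify. The function name `collapse_by_hexamer` is illustrative and is not taken from the authors' pipeline.

```python
from collections import Counter

HEXAMER_LEN = 6  # length of the random barcode; its position in the read is an assumption


def collapse_by_hexamer(reads):
    """Count each insert once per distinct random hexamer.

    Reads sharing both the insert and the hexamer are treated as PCR
    duplicates of a single ligation event and counted only once.
    """
    seen = set()
    counts = Counter()
    for read in reads:
        if len(read) <= HEXAMER_LEN:
            continue  # too short to carry a barcode plus an insert
        hexamer, insert = read[:HEXAMER_LEN], read[HEXAMER_LEN:]
        if (insert, hexamer) not in seen:
            seen.add((insert, hexamer))
            counts[insert] += 1
    return counts
```

Comparing raw read counts with these collapsed counts gives a rough indication of how strongly amplification inflated any given small RNA.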
Project description:T4 RNA ligases are commonly used to attach adapters to RNAs, but large differences in ligation efficiency make detection and quantitation problematic. We developed a ligation selection strategy using random RNAs in combination with high-throughput sequencing to gain insight into the differences in efficiency of ligating pre-adenylated DNA adapters to RNA 3'-ends. After analyzing biases in RNA sequence, secondary structure and RNA-adapter cofold structure, we conclude that T4 RNA ligases do not show significant primary sequence preference in RNA substrates, but are biased against structural features within RNAs and adapters. Specifically, RNAs with less than three unstructured nucleotides at the 3'-end and RNAs that are predicted to cofold with an adapter in unfavorable structures are likely to be poorly ligated. The effect of RNA-adapter cofold structures on ligation is supported by experiments where the ligation efficiency of specific miRNAs was changed by designing adapters to alter cofold structure. In addition, we show that using adapters with randomized regions results in higher ligation efficiency and reduced ligation bias. We propose that using randomized adapters may improve RNA representation in experiments that include a 3'-adapter ligation step.
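The three-nucleotide rule stated above can be sketched as a simple check on a predicted secondary structure. This assumes the structure is supplied in dot-bracket notation by a folding program of the reader's choice; the function names are hypothetical, and the sketch does not model RNA-adapter cofolding, which the study identifies as a separate source of bias.

```python
def unpaired_3prime(dot_bracket):
    """Return the number of consecutive unpaired positions ('.') at the 3' end."""
    return len(dot_bracket) - len(dot_bracket.rstrip("."))


def likely_poorly_ligated(dot_bracket, min_unpaired=3):
    """Flag RNAs with fewer than `min_unpaired` unstructured 3'-terminal nucleotides.

    The default threshold of three follows the abstract; it is a heuristic,
    not a guarantee of ligation failure.
    """
    return unpaired_3prime(dot_bracket) < min_unpaired
```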
Project description:The preparation of small RNA cDNA sequencing libraries depends on the unbiased ligation of adapters to the RNA ends. Small RNA with 5' recessed ends are poor substrates for enzymatic adapter ligation, but this 5' adapter ligation problem can go undetected if the library preparation steps are not monitored. Here we illustrate the severity of the 5' RNA end ligation problem using several pre-miRNA-like hairpins that allow us to expand the definition of the problem to include 5' ends close to a hairpin stem, whether recessed or in a short extension. The ribosome profiling method can avoid a difficult 5' adapter ligation, but the enzyme typically used to circularize the cDNA has been reported to be biased, calling into question the benefit of this workaround. Using the TS2126 RNA ligase 1 (a.k.a. CircLigase) as the circularizing enzyme, we devised a bias test for the circularization of first strand cDNA. All possible dinucleotides were circle-ligated with similar efficiency. To re-linearize the first strand cDNA in the ribosome profiling approach, we introduce an improved method wherein a single ribonucleotide is placed between the sequencing primer binding sites in the reverse transcriptase primer, which later serves as the point of re-linearization by RNase A. We incorporate this step into the ribosomal profiling method and describe a complete improved library preparation method, Coligo-seq, for the sequencing of small RNA with secondary structure close to the 5' end. This method accepts a variety of 5' modified RNA, including 5' monophosphorylated RNA, as demonstrated by the construction of a HeLa cell microRNA cDNA library.
Project description:Nucleic acid ligases are crucial enzymes that repair breaks in DNA or RNA during synthesis, repair and recombination. Various genomic tools have been developed using the diverse activities of DNA/RNA ligases. Herein, we demonstrate a non-conventional ability of T4 DNA ligase to insert 5' phosphorylated blunt-end double-stranded DNA to DNA breaks at 3'-recessed ends, gaps, or nicks to form a Y-shaped 3'-branch structure. Therefore, this base pairing-independent ligation is termed 3'-branch ligation (3'BL). In an extensive study of optimal ligation conditions, the presence of 10% PEG-8000 in the ligation buffer significantly increased ligation efficiency to more than 80%. Ligation efficiency varied slightly between different donor and acceptor sequences. More interestingly, we discovered that T4 DNA ligase efficiently ligated DNA to the 3'-recessed end of RNA, not to that of DNA, in a DNA/RNA hybrid, suggesting a ternary complex formation preference of T4 DNA ligase. These novel properties of T4 DNA ligase can be utilized as a broad molecular technique in many important genomic applications, such as 3'-end labelling by adding a universal sequence; directional tagmentation for NGS library construction that achieves a theoretical 100% template usage; and targeted RNA NGS libraries with mitigated structure-based bias and adapter dimer problems.
Project description:Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and the molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has posed a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of a Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from minute amounts of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides a high-throughput technique for identifying tRNA profiles and their regulation in various transcriptomes, which could play important regulatory roles in translation and other biological processes.
Project description:Recent advances in next-generation sequencing technologies have revealed that cellular functional RNAs are not always expressed as single entities with fixed terminal sequences but as multiple isoforms bearing complex heterogeneity in both length and terminal sequences, such as isomiRs, the isoforms of microRNAs. Unraveling the biogenesis and biological significance of heterogeneous RNA expression requires distinctive analysis of each RNA variant. Here, we report the development of dumbbell PCR (Db-PCR), an efficient and convenient method to distinctively quantify a specific individual small RNA variant. In Db-PCR, 5'- and 3'-stem-loop adapters are specifically hybridized and ligated to the 5'- and 3'-ends of target RNAs, respectively, by T4 RNA ligase 2 (Rnl2). The resultant ligation products with 'dumbbell-like' structures are subsequently quantified by TaqMan RT-PCR. We confirmed that the high specificity of Rnl2 ligation and TaqMan RT-PCR toward target RNAs discriminated both the 5'- and 3'-terminal sequences of target RNAs at single-nucleotide resolution, so that Db-PCR specifically detected target RNAs but not their corresponding terminal variants. Db-PCR had broad applicability for the quantification of various small RNAs in different cell types, and the results were consistent with those from other quantification methods. Therefore, Db-PCR provides a much-needed simple method for analyzing RNA terminal heterogeneity.
Project description:Total RNA was extracted from morphologically abnormal and sibling wild-type embryos identified by the Zebrafish Mutation Project (http://www.sanger.ac.uk/Projects/D_rerio/zmp/). The 3' end of fragmented RNA was pulled down using polyT oligos attached to magnetic beads, reverse transcribed, made into Illumina libraries and sequenced using Illumina HiSeq paired-end sequencing. Protocol: Total RNA was extracted from mouse embryos using Trizol and DNase treated. Chemically fragmented RNA was enriched for the 3' ends by pull-down using an anchored polyT oligo attached to magnetic beads. An RNA oligo comprising part of the Illumina adapter 2 was ligated to the 5' end of the captured RNA and the RNA was eluted from the beads. Reverse transcription was primed with an anchored polyT oligo with part of Illumina adapter 1 at the 5' end followed by 4 random bases, then an A, C or G base, then one of twelve 5-base indexing tags and 14 T bases. An Illumina library with full adapter sequence was produced by 15 cycles of PCR. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
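The primer layout described above (4 random bases, one A/C/G base, a 5-base index tag, 14 T bases) suggests how the start of a read could be split into its parts. The sketch below assumes, hypothetically, that the read begins immediately after the adapter 1 portion and contains only A, C, G and T; the function name and the returned field names are illustrative and do not come from the submitters' pipeline.

```python
import re

# Layout from the protocol text: NNNN + [ACG] + 5-base index + T x 14 + insert.
READ_LAYOUT = re.compile(r"([ACGT]{4})([ACG])([ACGT]{5})(T{14})(.*)")


def parse_read(read):
    """Split a read into random bases, anchor base, index tag and insert.

    Returns None when the read does not follow the assumed layout.
    """
    match = READ_LAYOUT.fullmatch(read)
    if match is None:
        return None
    random_bases, anchor, index_tag, _poly_t, insert = match.groups()
    return {
        "random_bases": random_bases,
        "anchor": anchor,
        "index_tag": index_tag,
        "insert": insert,
    }
```

Reads that fail to parse would typically be set aside for inspection rather than silently discarded, since sequencing errors in the poly(T) stretch are a plausible cause.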
Project description:RNA-seq was performed in biological replicates (2 mice per group) of wild-type and lincRNA-EPS-/- BMDMs at 0, 2, and 6 hours after LPS treatment (100 ng/ml). RNA-seq libraries were prepared as described (Heyer et al., 2015). Briefly, ribosomal RNA (rRNA) was depleted using the Ribozero rRNA removal kit (Epicentre). Purified mRNAs were fragmented with RNA fragmentation reagent (Life Technologies) for 4 minutes and 30 seconds at 70 °C to obtain 100-150 nt long RNA fragments. After ethanol precipitation and washing, RNAs were re-suspended in 5 µl of water and the 3' ends dephosphorylated using PNK (New England BioLabs) for 1 hr at 37 °C. RNA fragments with a 3'-OH were then ligated to a preadenylated DNA adaptor using T4 RNA ligase 2, truncated K227Q (NEB). Following this, ligated RNAs were reverse-transcribed with Superscript III (Invitrogen) using a bar-coded reverse-transcription primer that anneals to the preadenylated adaptor. After reverse transcription, gel-purified cDNAs were circularized using CircLigase I (Epicentre) and PCR amplified using paired-end primers PE1.0 and PE2.0 (Illumina) for 14 cycles. PCR amplicons were gel purified and submitted for sequencing on the Illumina HiSeq 2000. TopHat version 2.0.12 was used to align single sequence reads to version 73 of the Ensembl mouse genome (mm10) with options: --library-type fr-firststrand -g 1 -x 1 --read-mismatches 2. Cufflinks version 2.1.1 was used to estimate RNA abundances with Ensembl version 73 GTF and the --library-type fr-firststrand option. The Cuffdiff program was used to perform differential expression analysis between wild-type BMDMs and lincRNA-EPS-/- BMDMs at each corresponding time (0, 2, and 6 hours after LPS treatment). To compute fold-change values for genes/transcripts with FPKM values of zero, a pseudocount of 1 was added to all FPKM values first.
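The pseudocount step can be made concrete with a short sketch. The description states only that a pseudocount of 1 was added to all FPKM values; reporting the result on a log2 scale and taking the ratio as knockout over wild type are assumptions made here for illustration, and the function name is hypothetical.

```python
import math


def log2_fold_change(fpkm_wt, fpkm_ko, pseudocount=1.0):
    """log2 of (knockout + pseudocount) / (wild type + pseudocount).

    Adding the pseudocount to both values keeps the ratio finite when
    either FPKM is zero.
    """
    return math.log2((fpkm_ko + pseudocount) / (fpkm_wt + pseudocount))
```

For instance, a transcript absent in wild type (FPKM 0) and present at FPKM 3 in the knockout gives (3 + 1) / (0 + 1) = 4, or a log2 fold change of 2, where the unadjusted ratio would be undefined.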
Project description:A procedure is described for mapping the ends of RNAs. Using T4 RNA ligase, a DNA (3' end) or RNA (5' end) oligonucleotide is ligated to RNA ends followed by cDNA synthesis, PCR amplification, cloning and sequencing. This method determines 5' ends, 3' polyadenylation sites and the size of poly(A) tails, and should be applicable to non-polyadenylated mRNAs and to non-message RNAs. Analysis of four Tetrahymena thermophila histone mRNAs revealed multiple, closely spaced 5' ends consistent with those determined by other methods. Except for a 'CCAAT' box in either orientation 100-200 nucleotides upstream of the transcription start site, no conserved sequence elements were observed in the untranslated 5' region or in sequences immediately flanking the transcription start site. Analysis of the 3' ends of mRNAs encoding four histones, two tubulins and the Tetrahymena TATA binding protein confirmed the observations that Tetrahymena histone messages are polyadenylated and that poly(A) tails in this organism are short (approximately 50 nt). No canonical poly(A) addition signal was identified. The four histone messages analyzed contained three sequence elements, TGTGT-TAA-AAGTATT, not found in non-histone messages. Two non-histone messages contained GCATT(N)15ATACC near the poly(A) addition site.
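As a rough illustration of how a poly(A) tail size could be read from a sequenced clone, the sketch below measures the run of A residues immediately upstream of the ligated 3' oligonucleotide. It assumes the clone sequence is given in mRNA sense and that the oligonucleotide sequence is known; the function name and the linker used in any example are hypothetical and are not from the original procedure.

```python
def poly_a_tail_length(clone_seq, linker):
    """Return the length of the A run directly 5' of the ligated linker.

    Returns None if the linker sequence is not found in the clone.
    """
    pos = clone_seq.find(linker)
    if pos == -1:
        return None
    upstream = clone_seq[:pos]
    # Count only the uninterrupted stretch of A residues abutting the linker.
    return len(upstream) - len(upstream.rstrip("A"))
```

A real analysis would also need to tolerate sequencing errors within the tail, which this exact-match sketch does not attempt.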
Project description:Adult sibling homozygous wild-type and tcf7l1a mutants were paired, and total RNA was Trizol-extracted from wild-type and maternal zygotic homozygous tcf7l1a mutant embryos at the 90% epiboly stage (9 hours post fertilization). ZFIN Identifier ZDB-GENE-980605-30, PMID:11057671. The 3' end of fragmented RNA was pulled down using polyT oligos attached to magnetic beads, reverse transcribed, made into Illumina libraries and sequenced using Illumina HiSeq paired-end sequencing. Protocol: Total RNA was extracted from zebrafish embryos using Trizol and DNase treated. Chemically fragmented RNA was enriched for the 3' ends by pull-down using an anchored polyT oligo attached to magnetic beads. An RNA oligo comprising part of the Illumina adapter 2 was ligated to the 5' end of the captured RNA and the RNA was eluted from the beads. Reverse transcription was primed with an anchored polyT oligo with part of Illumina adapter 1 at the 5' end followed by 12 random bases, then an 8-base indexing tag, then CG and 14 T bases. An Illumina library with full adapter sequence was produced by 20 cycles of PCR. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
Project description:Recent advances in sequencing technology have helped unveil the unexpected complexity and diversity of small RNAs. A critical step in small RNA library preparation for sequencing is the ligation of adapter sequences to both the 5' and 3' ends of small RNAs. Studies have shown that adapter ligation introduces a significant but widely unappreciated bias in the results of high-throughput small RNA sequencing. We show that due to this bias the two widely used Illumina library preparation protocols produce strikingly different microRNA (miRNA) expression profiles in the same batch of cells. There are 102 highly expressed miRNAs that are >5-fold differentially detected and some miRNAs, such as miR-24-3p, are over 30-fold differentially detected. While some level of bias in library preparation is not surprising, the apparent massive differential bias between these two widely used adapter sets is not well appreciated. In an attempt to mitigate this bias, the new Bioo Scientific NEXTflex V2 protocol utilizes a pool of adapters with random nucleotides at the ligation boundary. We show that this protocol is able to detect robustly several miRNAs that evade capture by the Illumina-based methods. While these analyses do not indicate a definitive gold standard for small RNA library preparation, the results of the NEXTflex protocol do correlate best with RT-qPCR. As increasingly more laboratories seek to study small RNAs, researchers should be aware of the extent to which the results may differ with different protocols, and should make an informed decision about the protocol that best fits their study.
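A comparison of the kind described above, flagging miRNAs detected more than 5-fold differently by two protocols, can be sketched as follows. The reads-per-million normalization, the pseudocount and the function names are assumptions made for illustration; the abstract does not state how the authors normalized or thresholded their data.

```python
def reads_per_million(counts):
    """Scale raw counts so that each library sums to one million."""
    total = sum(counts.values())
    return {name: 1e6 * n / total for name, n in counts.items()}


def differentially_detected(counts_a, counts_b, min_fold=5.0, pseudocount=1.0):
    """Return miRNAs whose normalized abundance differs by more than `min_fold`.

    The fold value is direction-independent (larger over smaller); the
    pseudocount avoids division by zero for miRNAs missing from one library.
    """
    rpm_a, rpm_b = reads_per_million(counts_a), reads_per_million(counts_b)
    flagged = {}
    for name in sorted(set(rpm_a) | set(rpm_b)):
        a = rpm_a.get(name, 0.0) + pseudocount
        b = rpm_b.get(name, 0.0) + pseudocount
        fold = max(a, b) / min(a, b)
        if fold > min_fold:
            flagged[name] = fold
    return flagged
```

Both inputs are expected to be non-empty mappings from miRNA name to raw read count for the same biological sample prepared with two different protocols.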