DSBCapture: in situ capture and direct sequencing of dsDNA breaks
ABSTRACT: Double-strand DNA breaks (DSBs) continuously arise and are a source of mutations and chromosomal rearrangements. Here, we present DSBCapture, a sequencing-based method that captures DSBs in situ and directly maps them at single-nucleotide resolution, enabling the study of DSB origin. DSBCapture shows substantially increased sensitivity and data yield compared to other methods. Employing DSBCapture, we uncovered a striking relationship between DSBs and elevated transcription within nucleosome-depleted chromatin. Six library samples prepared with a custom protocol (DSBCapture or BLESS) were sequenced as 75-base-pair paired-end reads on an Illumina NextSeq 500 (50 bp on a MiSeq for the EcoRV library): 1 replicate for the EcoRV library, 1 replicate for the library from the U2OS AID-DIvA cell line with the AsiSI restriction enzyme, 2 replicates for the BREAk-seq NHEK libraries and 2 replicates for the BLESS NHEK libraries. Four RNA-Seq library samples from HEK Gibco cells were sequenced as 75-base-pair single-end reads on the Illumina NextSeq 500.
Project description:Over the last decade, the number of viral genome sequences deposited in available databases has grown exponentially. However, sequencing methodologies vary widely, and many published works have relied on viral enrichment by viral culture or nucleic acid amplification with specific primers rather than on unbiased techniques such as metagenomics. The genome of RNA viruses is highly variable, and these enrichment methodologies may be difficult to achieve or may bias the results. In order to obtain genomic sequences of human respiratory syncytial virus (HRSV) from positive nasopharyngeal aspirates, diverse methodologies were evaluated and compared. A total of 29 nearly complete and complete viral genomes were obtained. The best performance was achieved with DNase I treatment of the RNA directly extracted from the nasopharyngeal aspirate (NPA), sequence-independent single-primer amplification (SISPA) and library preparation with the Nextera XT DNA Library Prep Kit with manual normalization. An average of 633,789 and 1,674,845 filtered reads per library were obtained with the MiSeq and NextSeq 500 platforms, respectively. The higher output of the NextSeq 500 was accompanied by an increase in the percentage of duplicated reads generated during SISPA (from an average of 1.5% duplicated viral reads on MiSeq to an average of 74% on NextSeq 500). HRSV genome recovery was not affected by the presence or absence of duplicated reads, but the computational demand during the analysis increased. Considering that only samples with viral load ≥ E+06 copies/ml NPA were tested, no correlation was observed between sample viral load and the number of total filtered reads, nor with the number of mapped viral reads. The HRSV genomes showed a mean coverage of 98.46% with the best methodology. In addition, genomes of human metapneumovirus (HMPV), human rhinovirus (HRV) and human parainfluenza virus types 1-3 (HPIV1-3) were also obtained with the selected optimal methodology.
Project description:Guaranteeing high-quality next-generation sequencing data in a rapidly changing environment is an ongoing challenge. The introduction of the Illumina NextSeq 500 and the deprecation of specific metrics from Illumina's Sequencing Analysis Viewer (SAV; Illumina, San Diego, CA, USA) have made it more difficult to determine directly the baseline error rate of sequencing runs. To improve our ability to measure base quality, we have created an open-source tool to construct the Percent Perfect Reads (PPR) plot, previously provided by the Illumina sequencers. The PPR program is compatible with HiSeq 2000/2500, MiSeq, and NextSeq 500 instruments and provides an alternative to Illumina's quality value (Q) scores for determining run quality. Whereas Q scores are representative of run quality, they are often overestimated and are sourced from different look-up tables for each platform. The PPR's unique capabilities as a cross-instrument comparison device, as a troubleshooting tool, and as a tool for monitoring instrument performance can provide an increase in clarity over SAV metrics that is often crucial for maintaining instrument health. These capabilities are highlighted.
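The idea behind a Percent Perfect Reads curve can be sketched in a few lines of Python. This is a minimal illustration, not the published PPR program: it assumes that PPR at cycle c is the percentage of reads with no mismatch against a reference in cycles 1 through c, and the input format (one list of mismatch cycles per read) and the function name are hypothetical.

```python
from typing import List


def percent_perfect_reads(mismatch_cycles: List[List[int]], n_cycles: int) -> List[float]:
    """Return, for each cycle c (1-based), the percentage of reads that
    have no mismatch in cycles 1..c.

    mismatch_cycles[i] lists the 1-based cycles at which read i mismatches
    the reference (hypothetical input format, assumed for illustration).
    """
    total = len(mismatch_cycles)
    if total == 0:
        return [0.0] * n_cycles

    # first_error[c] counts reads whose first mismatch occurs at cycle c.
    first_error = [0] * (n_cycles + 1)
    for cycles in mismatch_cycles:
        in_range = [c for c in cycles if 1 <= c <= n_cycles]
        if in_range:
            first_error[min(in_range)] += 1

    ppr = []
    still_perfect = total
    for c in range(1, n_cycles + 1):
        # A read stops being "perfect" at the cycle of its first mismatch.
        still_perfect -= first_error[c]
        ppr.append(100.0 * still_perfect / total)
    return ppr


if __name__ == "__main__":
    # Four reads: two error-free, one failing at cycle 3, one at cycle 1.
    reads = [[], [3], [1, 4], []]
    print(percent_perfect_reads(reads, n_cycles=4))
```

Because the curve only ever decreases, two runs or two instruments can be compared by overlaying their curves, which matches the cross-instrument use described above.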
Project description:Splenocytes, peripheral blood cells, peritoneal cells and tumor cells were washed with PBS and counted. Cells were centrifuged at 400 g for 5 min and the supernatant was removed. The cell pellet (< 3 × 10^6 cells) was resuspended in 350 µl of RLT buffer (Qiagen). RNA was extracted with the RNeasy Mini kit (Qiagen) according to the manufacturer's protocol. mRNA library preparation was performed following the manufacturer's recommendations (KAPA mRNA HyperPrep, Roche). Library purity and integrity were assessed using an Agilent 2200 TapeStation (Agilent Technologies, Waldbronn, Germany). The final pooled library of 7 samples was sequenced on a NextSeq 500 (Illumina) with a Mid Output cartridge (2 × 130 million reads of 75 bases), corresponding to 2 × 18 million reads per sample after demultiplexing.
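The per-sample depth quoted above follows from dividing the cartridge output by the number of pooled samples. The sketch below only reproduces that arithmetic; it assumes perfectly even pooling and no reads lost to demultiplexing, and the function name is illustrative.

```python
def read_pairs_per_sample(total_read_pairs: int, n_samples: int) -> int:
    """Expected read pairs per sample under perfectly even pooling
    (an idealization; real runs lose some reads at demultiplexing)."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return total_read_pairs // n_samples


if __name__ == "__main__":
    # 130 million read pairs shared by 7 samples.
    pairs = read_pairs_per_sample(130_000_000, 7)
    print(f"{pairs / 1e6:.1f} million read pairs per sample")
```

The idealized figure is about 18.6 million pairs per sample, consistent with the roughly 2 × 18 million reported once some loss is allowed for.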
Project description:Microorganisms are useful environmental indicators, able to deliver essential insights into processes involved in mine land rehabilitation. To compare microbial communities from a chronosequence of mine land rehabilitation to pre-disturbance levels from reference sites covered by native vegetation, we sampled non-rehabilitated, rehabilitating and reference study sites from the Urucum Massif, Southwestern Brazil. From each study site, three composite soil samples were collected for chemical, physical, and metagenomic analysis. We used a paired-end library sequencing technology (Illumina NextSeq 500); the reads were assembled using MEGAHIT. Coding DNA sequences (CDS) were identified using Kaiju in combination with non-redundant NCBI BLAST reference sequences containing archaea, bacteria, and viruses. Additionally, a functional classification was performed by EMG v2.3.2. Here, we provide the raw data and assembly (reads and contigs), followed by initial functional and taxonomic analysis, as a baseline for further studies of this kind. Further investigation is needed to fully understand the mechanisms of environmental rehabilitation in tropical regions, and we hope this collection inspires other researchers to explore it for hypothesis testing.
Project description:We identified spatially restricted transcription factors and found SOX15 expression confined to stratified esophageal epithelium, with attenuation in Barrett's esophagus. SOX15 binds esophagus-specific loci, and its loss in human esophageal cells affected esophagus-specific transcripts. [RNA-Seq] Total RNA was isolated from CPA control cells and from CPA cells following SOX15 depletion; samples were prepared for sequencing using the TruSeq RNA Sample Preparation Kit (Illumina) according to the manufacturer's instructions. 75-base-pair single-end reads were sequenced on an Illumina NextSeq 500 instrument. The data include 2 independent biological replicates per genotype. [ChIP-Seq] Examine SOX15-chromatin binding in CPA cells.
Project description:Our study provides a detailed analysis of DF-induced transcriptomes generated by RNA-Seq technology. RNA-Seq-based transcriptome analyses would elucidate the complex molecular mechanisms of multi-herbal compositions such as DF. Overall design: mRNA profiles of control 3T3-L1 adipocytes and DF-treated 3T3-L1 adipocytes were analyzed by RNA-Seq. Library construction was performed using the SENSE 3' mRNA-Seq Library Prep Kit (Lexogen, Inc., Austria), and high-throughput sequencing was performed as single-end 75 sequencing using a NextSeq 500 (Illumina, Inc., San Diego, CA, USA).
Project description:Total RNA was extracted using RNAble (Eurobio), cleaned up with RNeasy columns (Qiagen), and then sequenced. The libraries were prepared following the TruSeq Stranded mRNA protocol (Illumina), starting from 1 μg of high-quality total RNA. Paired-end (2 × 75 bp) sequencing was performed on an Illumina NextSeq 500 platform (Illumina).
Project description:To determine whether the intestine-restricted transcription factor (TF) CDX2 functionally interacts with the endoderm-wide TF HNF4A, we crossed tissue-specific conditional Cdx2 and Hnf4a knockout mice to generate compound mutant mice. We used RNA-sequencing to profile gene expression changes in compound mutant mice compared to control mice. The compound mutant mice had a significantly worse phenotype than either single mutant, and gene expression was significantly perturbed in compound mutants compared to control mice. Total RNA isolated from control and compound mutant (Hnf4a-del;Cdx2-del) jejunal mouse intestinal epithelium was prepared for sequencing using the TruSeq RNA Sample Preparation Kit (Illumina) according to the manufacturer's instructions. 75-base-pair single-end reads were sequenced on an Illumina NextSeq 500 instrument. The data include 2 independent biological replicates per genotype.
Project description:Three libraries from 100 HEK293 cells each were prepared using a Smartseq-based custom library preparation approach with unique molecular identifiers. Libraries were sequenced on an Illumina NextSeq 500. HEK293 cell (100 cells) 5' selective RNAseq profiling, N4H4 unique molecular identifiers, 3 replicates.
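To illustrate how an N4H4 unique molecular identifier might be used, the following sketch validates UMIs against the pattern and counts distinct UMIs per gene. It assumes the UMI is read as four N positions (any base) followed by four H positions (A, C or T in IUPAC notation); the input format, gene names and function names are hypothetical, and the record does not describe the actual processing pipeline.

```python
import re
from typing import Dict, Iterable, Set, Tuple

# Assumed layout: NNNNHHHH, where N = A/C/G/T and H = A/C/T (IUPAC codes).
UMI_PATTERN = re.compile(r"[ACGT]{4}[ACT]{4}")


def is_valid_umi(umi: str) -> bool:
    """True if the UMI matches the assumed N4H4 layout exactly."""
    return UMI_PATTERN.fullmatch(umi) is not None


def count_molecules(reads: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct valid UMIs per gene from (gene, umi) pairs.

    Reads sharing a gene and a UMI are treated as PCR duplicates of
    one original molecule; reads with an invalid UMI are discarded.
    """
    umis_by_gene: Dict[str, Set[str]] = {}
    for gene, umi in reads:
        if is_valid_umi(umi):
            umis_by_gene.setdefault(gene, set()).add(umi)
    return {gene: len(umis) for gene, umis in umis_by_gene.items()}


if __name__ == "__main__":
    example = [
        ("GAPDH", "ACGTACTA"),
        ("GAPDH", "ACGTACTA"),  # duplicate of the first read
        ("GAPDH", "TTTTCCCC"),
        ("ACTB", "ACGTGGGG"),   # G in an H position: rejected
    ]
    print(count_molecules(example))
```

Under this layout there are 4^4 × 3^4 = 20,736 possible UMIs, and excluding G from the last four positions gives a simple check for sequencing or synthesis errors in the tag.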
Project description:Poly(A) tails at the 3' end of eukaryotic messenger RNAs control mRNA stability and translation efficiency. Facilitated by various NGS methods, alternative polyadenylation sites determining the 3'-UTR length of gene transcripts have been extensively studied. However, poly(A) lengths, which show dynamic and developmental regulation, remain largely unexplored. The recently developed NGS-based methods for genome-wide poly(A) profiling have promoted the study of genome-wide poly(A) dynamics. Here we present a straightforward NGS method for poly(A) profiling, which applies direct 3'-end adaptor ligation and template switching for 5'-end adaptor ligation for cDNA library construction. Poly(A) lengths are directly calculated from base call data using a self-developed pipeline, pA-finder. The libraries were directly sequenced from the 3'-UTR regions into the following poly(A) tails, first on a NextSeq 500 to produce single-end 300-nt reads, demonstrating the feasibility of the method and that optimizing the fragmented RNA size for cDNA library construction could detect longer poly(A) tails. We next applied Poly(A)-seq cDNA libraries containing 40-nt and 120-nt poly(A) tail spike-in RNAs on HiSeq X Ten and NovaSeq 6000 to obtain 150-nt and 250-nt paired-end reads. The sequencing profiles of the spike-in RNAs demonstrated both high accuracy and high quality scores in reading poly(A) tails. Poly(A) signal bleeding into the 3' adaptor sequence and a sharply decreased quality score at the junction were observed, allowing the modification of pA-finder to remove homopolymeric signal bleeding. We hope that wide application of Poly(A)-seq will help facilitate the study of development- and disease-related poly(A) dynamics and regulation, and of the recently emerging mixed tailing regulation.
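As a rough illustration of calculating a poly(A) length from base calls, the sketch below reports the longest uninterrupted run of A in a read. This is not the pA-finder algorithm, which the description does not specify; the function name, the minimum-length threshold and the example sequences are assumptions, and the sketch ignores the miscalled bases and signal bleeding discussed above.

```python
import re


def longest_poly_a(read: str, min_len: int = 10) -> int:
    """Length of the longest uninterrupted run of 'A' in the read.

    Runs shorter than min_len (an arbitrary illustrative threshold) are
    reported as 0 so that short A stretches inside the 3'-UTR are not
    mistaken for a tail.
    """
    runs = [m.end() - m.start() for m in re.finditer(r"A+", read.upper())]
    longest = max(runs, default=0)
    return longest if longest >= min_len else 0


if __name__ == "__main__":
    # Hypothetical read: 3'-UTR fragment, a 40-nt tail, then adaptor sequence.
    read = "GATTACAGT" + "A" * 40 + "CTGTAGGCAC"
    print(longest_poly_a(read))
```

A practical tool would need to tolerate occasional non-A calls within the tail and to trim homopolymer signal that bleeds into the adaptor, which is the refinement the description attributes to the modified pA-finder.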