TELP, a sensitive and versatile library construction method for next-generation sequencing
ABSTRACT: Next-generation sequencing has been widely used for the genome-wide profiling of histone modifications, transcription factor binding and gene expression through chromatin immunoprecipitated DNA sequencing (ChIP-seq) and cDNA sequencing (RNA-seq). Here, we describe a versatile library construction method that can be applied to both ChIP-seq and RNA-seq on the widely used Illumina platforms. Standard methods for ChIP-seq library construction require nanograms of starting DNA, substantially limiting their application to rare cell types or limited clinical samples. By minimizing the DNA purification steps that cause major sample loss, our method achieved a high sensitivity in ChIP-seq library preparation. Using this method, we achieved the following: (1) generated high-quality epigenomic and transcription factor-binding maps using ChIP-seq for murine adipocytes; (2) successfully prepared a ChIP-seq library from as little as 25 pg of starting DNA; (3) achieved paired-end sequencing of the ChIP-seq libraries; (4) systematically profiled gene expression dynamics during murine adipogenesis using RNA-seq; and (5) preserved the strand specificity of the transcripts in RNA-seq. Given its sensitivity and versatility in both double-stranded and single-stranded DNA library construction, this method has wide applications in genomic, epigenomic, transcriptomic and interactomic studies. Pre-adipocytes and mature adipocytes were collected. Their chromatin and RNA were subjected to ChIP and mRNA extraction, respectively. Sequencing libraries from ChIP DNA or mRNA were generated following either standard protocols or the TELP method. The quality and features of the TELP libraries were assessed by comparison with standard libraries and other published data.
Project description:We assessed Smart-Seq, a new single-cell RNA-Seq library preparation method, on a variety of mouse and human RNA samples or cells. We generated RNA-Seq libraries for dilution series of MAQC reference RNA and mouse brain RNA to assess technical reproducibility, and for a variety of individual cells including putative circulating tumour cells.
Project description:Meiotic DNA double-strand breaks (DSBs) initiate genetic recombination in discrete areas of the genome called recombination hotspots. Although DSBs can be directly mapped using ChIP-Seq and antibodies against ssDNA-associated proteins, genome-wide mapping of recombination hotspots in mammals is still a challenge due to the low frequency of recombination, the high heterogeneity of the germ cell population and the relatively low efficiency of ChIP. To overcome these limitations we have developed a novel method, single-stranded DNA (ssDNA) sequencing (SSDS), that specifically detects protein-bound single-stranded DNA at DSB ends. SSDS consists of a computational framework for the specific detection of ssDNA-derived reads in a sequencing library and a new library preparation procedure for the enrichment of fragments originating from ssDNA. When applied to mapping meiotic DSBs, the use of SSDS reduces the non-specific dsDNA background more than ten-fold. Our method can be extended to other systems where the identification of ssDNA or DSBs is desired. Development and validation of the method, SSDS, for the specific detection of ssDNA-derived and dsDNA-derived fragments in sequencing libraries and enrichment of ssDNA-derived fragments. SSDS was used to detect meiotic DSBs in 9R/13R mice.
Project description:In this work, we used our recently developed method, Spec-seq, to characterize the binding specificity of the glucocorticoid receptor (GR) in vitro. This GR Spec-seq experiment was run twice, separately (Dec 2014 and Mar 2015), with different library compositions. The basic workflow is the same as in our previous work on the lac repressor, published in Genetics 198.3 (2014): 1329-1343. Recombinant human GR protein was used for the in vitro DNA-binding and separation experiments. Bound and unbound DNA fragments were separated in EMSA gels, purified, and barcoded for Illumina sequencing. An initial experiment was performed using the putative consensus sequence: AGAACA GGG TGTTCT; randomized library 1: AGAACN NSN NGTTCT [diversity = 512]; randomized library 2: AGAANN GGG NNNTCT [diversity = 1024]; randomized library 3: AGAACA GGG TGNNNN [diversity = 256]; randomized library 4: AGAACA GGGC NNNTCT [diversity = 64]. The initial total library diversity was ~1853, with a library composition of 10% positive control sequence + randomized libraries 1-4 + 5% negative control sequence. A 5th library containing 2304 sequences based on DDNACW KKN KGTTCT, where D = "not C", N = "any base", W = "A or T" and K = "G or T", was subsequently prepared and analyzed using similar ratios of control sequences. Binding conditions for EMSA were 100 ng FAM-labelled dsDNA + 0/0.5/1/2/4 µM GR DBD protein for each lane, in 1X NEB buffer 4. GR DBD was prepared as previously described. The EMSA was performed using a 9% 33:1 acrylamide gel and TB buffer, and was run at 200 V for 30 min at 0 °C. The 2 µM protein lane was used for final sequencing. Bound/unbound fractions resulting from EMSA of these libraries and conditions were used to generate PWMs as described. The GR PWM that was generated through this analysis was used to define relative binding energies using the patser program (35), which can be accessed online at http://stormo.wustl.edu/consensus/cgi-bin/Server/Interface/patser.cgi.
Derived binding affinities are proportional to the inverse of the natural log of the calculated energies.
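The diversity figures quoted for the randomized libraries follow from the IUPAC degeneracy of each position. The sketch below is only an illustration of that arithmetic, not part of the Spec-seq analysis itself; the function name `library_diversity` is hypothetical. It multiplies the number of bases allowed at each position of a degenerate pattern.

```python
# Bases allowed by each IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def library_diversity(pattern: str) -> int:
    """Count the distinct sequences encoded by a degenerate pattern.

    Spaces are ignored, so patterns can be written as in the text,
    e.g. "AGAACN NSN NGTTCT". (Hypothetical helper for illustration.)
    """
    total = 1
    for symbol in pattern.replace(" ", "").upper():
        total *= len(IUPAC[symbol])
    return total


if __name__ == "__main__":
    for name, pattern in [
        ("library 1", "AGAACN NSN NGTTCT"),
        ("library 2", "AGAANN GGG NNNTCT"),
        ("library 3", "AGAACA GGG TGNNNN"),
        ("library 4", "AGAACA GGGC NNNTCT"),
        ("library 5", "DDNACW KKN KGTTCT"),
    ]:
        print(name, library_diversity(pattern))
```

Summing the four initial libraries this way gives 1856, slightly above the ~1853 reported; the description does not explain the difference.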
Project description:Epigenomic profiling by ChIP-seq is a prevailing methodology used to investigate chromatin-based regulation in biological systems, such as human disease, yet the lack of an empirical methodology to normalize amongst experiments has limited the usefulness of this technique. Here we describe a “spike-in” normalization method that allows the quantitative comparison of histone modification status across cell populations using defined quantities of a reference epigenome. We demonstrate the utility of this method in measuring epigenomic changes following chemical perturbations and show how control normalization of ChIP-seq experiments enables discovery of disease-relevant changes in histone modification occupancy. ChIP-Seq of histone modifications H3K79me2 and H3K4me3 in human samples treated with EPZ-5676 with/without reference epigenome spike-in.
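As a rough illustration of how a reference spike-in can put ChIP-seq signal from different samples on a common scale, the sketch below applies one common convention: each sample's counts are multiplied by a factor inversely proportional to the number of reads that map to the reference epigenome. This is an assumption for illustration only; the description above does not state the formula, and the function names are hypothetical.

```python
from typing import List, Sequence


def spike_in_scale_factor(reference_reads: int) -> float:
    """Scale factor per million reads mapped to the spiked-in reference.

    Assumed convention: factor = 1e6 / reference_reads, so a sample with
    more reference reads has its signal scaled down proportionally.
    """
    if reference_reads <= 0:
        raise ValueError("reference_reads must be positive")
    return 1_000_000 / reference_reads


def normalize_signal(raw_counts: Sequence[float], reference_reads: int) -> List[float]:
    """Multiply per-region counts by the sample's spike-in scale factor."""
    factor = spike_in_scale_factor(reference_reads)
    return [count * factor for count in raw_counts]


if __name__ == "__main__":
    # Two hypothetical samples with identical raw counts but different
    # numbers of reference (spike-in) reads.
    print(normalize_signal([10, 20], reference_reads=2_000_000))
    print(normalize_signal([10, 20], reference_reads=500_000))
```

With this convention, two samples that received the same amount of reference chromatin can be compared directly after scaling, even when their sequencing depths differ.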
Project description:ChIP-seq and input sequence data used in the development and evaluation of the BEADS normalization method. Examination of ChIP and input sequence reads across the worm genome.
Project description:High throughput sequencing is frequently used to discover the location of regulatory interactions on chromatin. However, techniques that enrich DNA where regulatory activity takes place, such as chromatin immunoprecipitation (ChIP), often yield less DNA than optimal for sequencing library preparation. Existing protocols for picogram-scale libraries require concomitant fragmentation of DNA, pre-amplification, or long overnight steps. We report a simple and fast library construction method that produces libraries from sub-nanogram quantities of DNA. This protocol yields conventional libraries with barcodes suitable for multiplexed sample analysis on the Illumina platform. We demonstrate the utility of this method by constructing a ChIP-seq library from 100 pg of ChIP DNA that demonstrates equivalent genomic coverage of target regions to a library produced from a larger scale experiment. Application of this method allows whole genome studies from samples where material or yields are limiting. Comparison of ChIP-seq libraries constructed from 100 pg DNA (this study) and nanograms of DNA (modENCODE). ChIP antibody: H3K27me3, Active Motif 31955.
Project description:This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Peggy Farnham, pfarnham@usc.edu, for questions concerning data collection and usage, and Philip Cayting, pcayting@stanford.edu, for data scoring and submission inquiries). If you have questions about the Genome Browser track associated with this data, contact ENCODE (genome@soe.ucsc.edu). This track, produced as part of the ENCODE Project, displays maps of histone modifications genome-wide using ChIP-seq in different cell lines. The ChIP-seq method involves first using formaldehyde to cross-link histones and other DNA-associated proteins to genomic DNA within cells. The cross-linked chromatin is subsequently extracted, sheared, and immunoprecipitated using specific antibodies. After reversal of cross-links, the immunoprecipitated DNA is sequenced and mapped to the human reference genome. The relative enrichment of each antibody-target (epitope) across the genome is inferred from the density of mapped fragments. Chemical modifications (e.g. methylation or acetylation) of the histone proteins present in chromatin influence gene expression by changing how accessible the chromatin is to transcription factors. Shown for each experiment (defined as a particular antibody and a particular cell type) is a track of enrichment for the specifically modified histone (Signal), along with sites that have the greatest enrichment (Peaks). Also included for each cell type is the input signal, which represents the control condition where no antibody targeting was performed. In general the following chemical modifications have associated genetic phenotypes: H3K4me3 and H3K9Ac are considered to be marks of active or potentially active promoter regions. H3K4me1 and H3K27Ac are considered to be marks of active or potentially active enhancer regions. H3K36me3 and H3K79me2 are considered to be marks of transcriptional elongation. 
H3K27me3 and H3K9me3 are considered to be marks of inactive regions. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf. Cells were grown according to the approved ENCODE cell culture protocols. Briefly, cells were crosslinked, chromatin was extracted and sonicated using a Bioruptor sonicator (Diagenode) to an average size of 300-500 bp, and individual ChIP assays were performed using antibodies to modified histones. For the K562 and Ntera2 histone ChIP-seq samples, immunoprecipitates were collected using protein G-coupled magnetic beads; a detailed ChIP and library protocol can be found at http://www.roadmapepigenomics.org/protocols. For the U2OS histone ChIP-seq samples, immunoprecipitates were collected using StaphA cells; a detailed protocol can be found at http://expression.genomecenter.ucdavis.edu/chip.html. Library DNA was quantitated using either a Nanodrop or a BioAnalyzer and sequenced on an Illumina GA2. The sequencing reads were mapped to the genome using the Eland alignment program. ChIP-seq data was scored based on sequence reads (length ~30 bp) that align uniquely to the human genome. From the mapped tags, a signal map of ChIP DNA fragments (average fragment length ~200 bp) was constructed where the signal height is the number of overlapping fragments at each nucleotide position in the genome. For each 1 Mb segment of each chromosome, a peak height threshold was determined by requiring a false discovery rate <= 0.05 when comparing the number of peaks above threshold to the number obtained from multiple simulations of a random null background with the same number of mapped reads (also accounting for the fraction of mappable bases for sequence tags in that 1 Mb segment). 
The number of mapped tags in a putative binding region is compared to the number of mapped tags in the same region from an input DNA control, after the input count has been normalized by correlating tag counts in genomic 10 kb windows. Using a binomial test, only regions that have a p-value <= 0.05 are considered to be significantly enriched compared to the input DNA control.
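Two steps of the scoring described above can be sketched in a few lines: building a signal map from overlapping fragments, and testing a region for enrichment over input. This is a minimal sketch, not the ENCODE pipeline. It assumes that each tag is extended to a fixed fragment length in one direction only, that the normalized input count is rounded to an integer, and that the binomial test is one-sided with a null proportion of 0.5 (the description does not specify the exact form of the test). All function names are hypothetical.

```python
from math import comb
from typing import List, Sequence


def fragment_pileup(starts: Sequence[int], fragment_length: int,
                    region_length: int) -> List[int]:
    """Signal height = number of fragments overlapping each position.

    Each tag start is extended to `fragment_length` bases; strand is
    ignored here for simplicity (an assumption of this sketch).
    """
    signal = [0] * region_length
    for start in starts:
        for pos in range(max(start, 0), min(start + fragment_length, region_length)):
            signal[pos] += 1
    return signal


def binomial_enrichment_pvalue(chip_tags: int, input_tags_normalized: float) -> float:
    """One-sided P(X >= chip_tags) for X ~ Binomial(n, 0.5).

    n is the ChIP tag count plus the rounded, normalized input tag count.
    The null proportion of 0.5 is an assumption of this sketch.
    """
    input_tags = round(input_tags_normalized)
    n = chip_tags + input_tags
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(chip_tags, n + 1))
    return tail / 2 ** n


def is_enriched(chip_tags: int, input_tags_normalized: float,
                alpha: float = 0.05) -> bool:
    """Apply the p-value <= 0.05 cutoff stated in the text."""
    return binomial_enrichment_pvalue(chip_tags, input_tags_normalized) <= alpha


if __name__ == "__main__":
    print(fragment_pileup([0, 2], fragment_length=3, region_length=6))
    print(binomial_enrichment_pvalue(10, 0))
    print(is_enriched(5, 5))
```

Exact integer arithmetic keeps the tail probability accurate for small regions; for very large tag counts a statistics library would be a more practical choice.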