Project description: This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Kevin White mailto:kpwhite@uchicago.edu (Principal Investigator), Subhradip Karmakar mailto:subhradip@uchicago.edu (Project Lead), Nick Bild mailto:nbild@bsd.uchicago.edu (Data Analyst), Alina Choudhury mailto:achoudhury@uchicago.edu (Laboratory Technician), Marc Domanus mailto:mdomanus@anl.gov (Sequencing Technician at Argonne National Lab)). If you have questions about the Genome Browser track associated with this data, contact ENCODE (mailto:genome@soe.ucsc.edu). This ENCODE track maps human transcription factor binding sites genome-wide using second-generation massively parallel sequencing. The mapping uses transcription factors expressed as GFP-tagged fusion proteins after BAC (bacterial artificial chromosome) recombineering. The U. of Chicago and Max Planck Institute (Dresden) pipeline generates recombineered (recombination-mediated genetic engineering) BACs for the production of cell lines or animals that express fusion proteins from epitope-tagged transgenes. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf. Cells were grown according to the approved ENCODE cell culture protocols (http://hgwdev/ENCODE/protocols/cell). Recombineering strategy: To facilitate high-throughput production of the transgenic constructs, the program BACFinder (Crowe, Rana et al. 2002) automatically selects the most suitable BAC clone for any given human gene and generates the sets of PCR primers required for tagging and verification (Poser, Sarov et al. 2008). Recombineering is used to insert tagging cassettes at either the N or C terminus of the protein. The N-terminal cassette has a dual eukaryotic-prokaryotic promoter (PGK-gb2) driving a neomycin-kanamycin resistance gene within an artificial intron inside the tag coding sequence. 
The selection cassette is flanked by two loxP sites and can be permanently removed by Cre recombinase-mediated excision. The C-terminal cassette contains the sequence encoding the tag followed by an internal ribosome entry site (IRES) in front of the neomycin resistance gene. In addition, a short bacterial promoter (Gb3) drives the expression of the neomycin-kanamycin resistance gene in E. coli. The tagging cassettes, which carry 50-nucleotide PCR-introduced homology arms, are inserted into the BAC by recombineering, either behind the start codon (for the N-terminal tag) or in front of the stop codon (for the C-terminal tag) of the gene. E. coli cells that have successfully recombined the cassette are selected for kanamycin resistance in liquid culture. Each saturated culture from a specific recombineering reaction is derived from 10-200 independent recombination events. When two independent clones from each reaction were checked by PCR through the tag insertion point, 97% (85/88) yielded a PCR product of the expected size. Most of the clones that failed to grow were missing the targeted genomic region. An estimated 10% of the BACs used are chimeric, rearranged or wrongly mapped. Thus, initial results indicate that the necessary recombineering steps can be carried out with high fidelity. The White lab produced all epitope-tagged transcription and chromatin factor BACs, as well as the genome-wide ChIP data and analysis. An application of this approach to the analysis of closely related paralogs (RARa and RARg) yielded transcription factors, chromatin factors, cell lines, ChIP-chip data and ChIP-seq data (Hua, Kittler et al. Cell 2009). Such paralogous transcription factors often cannot otherwise be distinguished by antibodies. Sample Preparation: ChIP DNA from samples is sheared to ~800bp using a nebulizer. The ends of the DNA are polished, and two unique adapters are ligated to the fragments. 
Ligated fragments of 150-200bp are isolated by gel extraction and amplified using limited cycles of PCR. Sequencing System: Illumina GAIIx and HiSeq next-generation sequencing produced all ChIP-seq data. Processing and Analysis Software: Raw sequencing reads are aligned using Bowtie version 0.12.5 (Langmead et al. 2009). The "-m 1" parameter is applied to suppress alignments of reads that map more than once in the genome. Reads are aligned to the UCSC hg19 assembly. Wiggle format signal files are generated with SPP 2.7.1 for R 2.7.1. MACS 1.3.7 is used to call peaks. The MACS parameters used vary by experiment. The White lab used goat anti-GFP antibody to perform ChIP in untagged K562 cells as a background control. The test IP was performed in the same way as the background control. Results are expressed as values of the test normalized to the background.
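
The effect of the "-m 1" setting can be illustrated with a minimal Python sketch. This is not part of the White lab pipeline: Bowtie applies the rule internally during alignment, whereas the sketch applies the same idea after the fact to a hypothetical list of (read id, chromosome, position, strand) tuples. The function name and the example records are assumptions made for illustration only.

```python
from collections import Counter

def keep_unique_reads(alignments):
    """Keep only the alignments of reads that align exactly once.

    `alignments` is a list of (read_id, chrom, pos, strand) tuples.
    A read with more than one reported alignment is dropped entirely,
    which mirrors the intent of suppressing multi-mapping reads.
    """
    counts = Counter(read_id for read_id, *_ in alignments)
    return [aln for aln in alignments if counts[aln[0]] == 1]

# Hypothetical alignment records; read2 maps to two places.
alignments = [
    ("read1", "chr1", 1000, "+"),
    ("read2", "chr1", 5000, "-"),
    ("read2", "chr7", 8000, "-"),
    ("read3", "chrX", 42, "+"),
]
unique = keep_unique_reads(alignments)
print([aln[0] for aln in unique])
```

Dropping the read altogether, rather than keeping one of its placements at random, avoids assigning signal to a locus that cannot be distinguished from its repeats.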

Project description: This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Richard Sandstrom mailto:sull@u.washington.edu). If you have questions about the Genome Browser track associated with this data, contact ENCODE (mailto:genome@soe.ucsc.edu). This track is produced as part of the ENCODE Project. This track shows DNaseI sensitivity measured genome-wide in different cell lines using the Digital DNaseI methodology (see below), and DNaseI hypersensitive sites. DNaseI has long been used to map general chromatin accessibility, and DNaseI hypersensitivity is a universal feature of active cis-regulatory sequences. The use of this method has led to the discovery of functional regulatory elements that include enhancers, insulators, promoters, locus control regions and novel elements. For each experiment (cell type) this track shows DNaseI sensitivity as a continuous function using sequencing tag density (Raw Signal), and discrete loci of DNaseI sensitive zones (HotSpots) and hypersensitive sites (Peaks). For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf. Cells were grown according to the approved ENCODE cell culture protocols. Digital DNaseI was performed by DNaseI digestion of intact nuclei, isolating DNaseI 'double-hit' fragments as described in Sabo et al. (2006), and direct sequencing of fragment ends (which correspond to in vivo DNaseI cleavage sites) using the Solexa platform (36 bp reads). Uniquely mapping high-quality reads were mapped to the genome. DNaseI sensitivity is directly reflected in raw tag density (Raw Signal), which is shown in the track as density of tags mapping within a 150 bp sliding window (at a 20 bp step across the genome). DNaseI sensitive zones (HotSpots) were identified using the HotSpot algorithm described in Sabo et al. (2004). 
1.0% false discovery rate thresholds (FDR 0.01) were computed for each cell type by applying the HotSpot algorithm to an equivalent number of random uniquely mapping 36mers. DNaseI hypersensitive sites (DHSs or Peaks) were identified as signal peaks within FDR 1.0% hypersensitive zones using a peak-finding algorithm.
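
The raw tag density described above (tags counted in a 150 bp window moved in 20 bp steps) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the submitting laboratory's code: the function name, the half-open window convention, the choice to start windows at position 0, and the example cleavage positions are all hypothetical.

```python
from bisect import bisect_left

def window_density(cut_sites, chrom_length, window=150, step=20):
    """Count tags in each sliding window along one chromosome.

    `cut_sites` holds 0-based cleavage positions (one per tag).
    Windows are half-open intervals [start, start + window), an
    assumption of this sketch. Returns (start, end, count) tuples.
    """
    sites = sorted(cut_sites)
    densities = []
    for start in range(0, max(chrom_length - window, 0) + 1, step):
        end = start + window
        # Binary search gives the number of sites in [start, end).
        count = bisect_left(sites, end) - bisect_left(sites, start)
        densities.append((start, end, count))
    return densities

# Hypothetical cleavage positions on a 300 bp toy chromosome.
density = window_density([10, 15, 160, 165, 170], chrom_length=300)
for start, end, count in density[:3]:
    print(start, end, count)
```

Sorting once and using binary search keeps the cost per window logarithmic in the number of tags, which matters when the same loop runs over a whole genome.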

Project description: This track is produced as part of the mouse ENCODE Project. This track shows DNaseI sensitivity measured genome-wide in mouse tissues and cell lines using the Digital DNaseI methodology (see below), and DNaseI hypersensitive sites. DNaseI has long been used to map general chromatin accessibility, and DNaseI hypersensitivity is a universal feature of active cis-regulatory sequences. The use of this method has led to the discovery of functional regulatory elements that include enhancers, insulators, promoters, locus control regions and novel elements. For each experiment (tissue/cell type) this track shows DNaseI sensitivity as a continuous function using sequencing tag density (Signal), and discrete loci of DNaseI sensitive zones (HotSpots) and hypersensitive sites (Peaks). For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf. Cells were grown according to the approved ENCODE cell culture protocols (http://hgwdev.cse.ucsc.edu/ENCODE/protocols/cell/mouse). Fresh tissues were harvested from mice and the nuclei prepared according to the tissue-appropriate protocol (http://hgwdev.cse.ucsc.edu/ENCODE/protocols/cell/mouse). Digital DNaseI was performed by DNaseI digestion of intact nuclei, isolating DNaseI 'double-hit' fragments as described in Sabo et al. (2006), and direct sequencing of fragment ends (which correspond to in vivo DNaseI cleavage sites) using the Illumina IIx (and Illumina HiSeq by early 2011) platform (36 bp reads). Uniquely mapping high-quality reads were mapped to the genome using the Bowtie aligner. DNaseI sensitivity is directly reflected in raw tag density, which is shown in the track as density of tags mapping within a 150 bp sliding window (at a 20 bp step across the genome). DNaseI sensitive zones (HotSpots) were identified using the HotSpot algorithm described in Sabo et al. (2004). 
1.0% false discovery rate thresholds (FDR 0.01) were computed for each cell type by applying the HotSpot algorithm to an equivalent number of random uniquely mapping 36mers. DNaseI hypersensitive sites (DHSs or Peaks) were identified as signal peaks within FDR 1.0% hypersensitive zones using a peak-finding algorithm (I-max).
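
The idea of deriving a 1.0% FDR threshold from random reads can be sketched generically. The code below is not the HotSpot algorithm and does not reproduce its scoring; it only assumes that each candidate zone has a numeric score, and that scores are available both for the observed data and for an equal-sized set of random uniquely mapping 36mers. The function name and the example scores are hypothetical.

```python
def fdr_threshold(observed_scores, null_scores, fdr=0.01):
    """Return the smallest score cutoff whose empirical FDR is <= `fdr`.

    Empirical FDR at cutoff t is estimated as
        (number of null scores >= t) / (number of observed scores >= t).
    Returns None if no cutoff among the observed scores qualifies.
    """
    for cutoff in sorted(set(observed_scores)):
        n_observed = sum(score >= cutoff for score in observed_scores)
        n_null = sum(score >= cutoff for score in null_scores)
        if n_observed and n_null / n_observed <= fdr:
            return cutoff
    return None

# Hypothetical zone scores from real data and from random 36mers.
observed = [1, 2, 3, 5, 8, 9, 10, 12, 15, 20]
null = [1, 1, 2, 2, 3, 3, 4]
threshold = fdr_threshold(observed, null)
print(threshold)
```

Because the null set is generated from the same number of reads as the observed set, the ratio of counts can be read directly as an estimated fraction of false calls without further depth scaling.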

Project description: This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Scott Tenenbaum mailto:STenenbaum@uamail.albany.edu). If you have questions about the Genome Browser track associated with this data, contact ENCODE (mailto:genome@soe.ucsc.edu). The RNA binding protein (RBP) associated mRNA sequencing track (RIP-Seq) is produced as part of the Encyclopedia of DNA Elements (ENCODE) Project (http://hgwdev.cse.ucsc.edu/ENCODE/index.html). This track displays transcriptional fragments associated with RBP in cell lines (http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=cellType) K562 and GM12878, using Ribonomic profiling via Illumina SBS. In eukaryotic organisms, gene regulatory networks require an additional level of coordination that links transcriptional and post-transcriptional processes. Messenger RNAs have traditionally been viewed as passive molecules in the pathway from transcription to translation. However, it is now clear that RNA-binding proteins play a major role in regulating multiple mRNAs in order to facilitate gene expression patterns. These tracks show the associated mRNAs that co-precipitate with the targeted RNA-binding proteins using RIP-Seq profiling. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf. RBP-mRNA complexes were purified from cells grown according to the approved ENCODE cell culture protocols (http://hgwdev.cse.ucsc.edu/ENCODE/protocols/cell). RNA samples were amplified and converted to cDNA with the Nugen (http://www.nugeninc.com/) Ovation© RNA-Seq System and prepped for sequencing with the Illumina (http://www.illumina.com/) mRNA-Seq protocol. Approximately 30 million single-end sequencing reads were obtained for each of K562 and GM12878. 
RIP samples were analyzed for signal that was at or above the 60th percentile and statistically enriched compared to the negative control. Sequences were analyzed using TopHat (http://tophat.cbcb.umd.edu/) (Trapnell et al., 2009) with Bowtie (http://bowtie-bio.sourceforge.net/index.shtml) (Langmead et al., 2009). Peaks were called from the top 40% of TopHat normalized reads, with a max gap, min run of (24:48). Unions of overlapping peak regions from total RNA replicates (RIP-Input) are presented with a p-value from a one-tailed t-test for average signal from replicates versus 0 (no cut-off was used for totals). Replicate overlaps for positive RIP treatment peaks (ELAVL1 and PABPC1) are presented with a p-value from a one-tailed t-test versus signal for the same region in negative control replicates (T7-tag). RIP peaks were from sequences longer than 120 bp and had p-value < 0.05. For both totals (RIP-input) and RIPs, the peak scores are scaled relative p-values between treatment and control.
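
A max gap / min run rule such as the (24:48) setting above can be sketched as follows. The interpretation used here is an assumption, since the description does not define the terms: positions at or above the signal cutoff are chained into one region as long as the gap between consecutive passing positions is at most 24 bases, and a region is reported only if it spans at least 48 bases. The function names, the nearest-rank percentile, and the toy signal are hypothetical and are not taken from the submitting laboratory's software.

```python
def percentile_cutoff(values, pct=60):
    """Nearest-rank percentile of `values` (integer arithmetic only)."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # ceiling division
    return ordered[rank - 1]

def call_runs(signal, cutoff, max_gap=24, min_run=48):
    """Return half-open (start, end) regions of per-base `signal`.

    Passing positions (value >= cutoff) are merged when separated by
    at most `max_gap` failing bases; regions spanning fewer than
    `min_run` bases are discarded.
    """
    regions = []
    start = last = None
    for pos, value in enumerate(signal):
        if value < cutoff:
            continue
        if start is None:
            start = last = pos
        elif pos - last - 1 <= max_gap:
            last = pos  # gap is small enough: extend the current region
        else:
            if last - start + 1 >= min_run:
                regions.append((start, last + 1))
            start = last = pos  # gap too large: begin a new region
    if start is not None and last - start + 1 >= min_run:
        regions.append((start, last + 1))
    return regions

# Toy per-base signal: two blocks 10 bases apart, then a short block.
signal = [0] * 10 + [5] * 30 + [0] * 10 + [5] * 30 + [0] * 40 + [5] * 20 + [0] * 10
cutoff = percentile_cutoff([v for v in signal if v > 0], pct=60)
regions = call_runs(signal, cutoff)
print(regions)
```

In the toy signal the first two blocks merge because they are only 10 bases apart, while the final 20-base block is dropped for being shorter than the minimum run.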
