Dataset Information


Whole Genome Bisulfite Sequencing by ENCODE/HAIB

ABSTRACT: This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly. If you have questions about the Genome Browser track associated with this data, contact ENCODE.

This track was produced as part of the ENCODE project. It reports the percentage of DNA molecules that exhibit cytosine methylation. In general, DNA methylation within a gene's promoter is associated with gene silencing, and DNA methylation within the exons and introns of a gene is associated with gene expression. Proper regulation of DNA methylation is essential during development, and aberrant DNA methylation is a hallmark of cancer. DNA methylation status was assayed with Whole Genome Bisulfite Sequencing (WGBS). Genomic DNA was sheared by sonication, end-repaired and then ligated to methylated sequencing adapters. The library fragments were treated with sodium bisulfite and amplified by PCR to convert every unmethylated cytosine to a thymine while leaving methylated cytosines intact. The sequenced fragments were aligned to a bisulfite-converted reference genome. For each assayed cytosine, the number of sequencing reads covering that C and the percentage of those reads that were methylated were reported. For data usage terms and conditions, please refer to the ENCODE data release policy.

DNA methylation at cytosines across the genome was assayed with Whole Genome Bisulfite Sequencing (WGBS). WGBS was performed on cell lines grown by ENCODE production groups. WGBS was carried out by the Myers production group at the HudsonAlpha Institute for Biotechnology.

Isolation of Genomic DNA: Genomic DNA was isolated from each cell line using the QIAGEN DNeasy Blood & Tissue Kit according to the instructions provided by the manufacturer. DNA concentrations for each genomic DNA preparation were determined using fluorescent DNA-binding dye and a fluorometer (Invitrogen Quant-iT dsDNA High Sensitivity Kit and Qubit Fluorometer).
Typically, 2 µg of genomic DNA is used to make WGBS libraries.

WGBS Library Construction and Sequencing: WGBS library construction started with sonication of genomic DNA on a Covaris S2 instrument. Sheared ends were then repaired and blunted with DNA polymerase I, T4 DNA polymerase and T4 polynucleotide kinase in the presence of dATP, dGTP and dTTP. After end repair, Klenow exo- DNA Polymerase was used to add an adenosine as a 3' overhang. Next, a methylated version of the Illumina paired-end adapters was ligated onto the DNA. Adapter-ligated genomic DNA fragments of 400 bp were selected using a 2% agarose SizeSelect E-gel. The selected adapter-ligated fragments were treated with sodium bisulfite using the Zymo Research EZ DNA Methylation Gold Kit, which converts unmethylated cytosines to uracils and leaves methylated cytosines unchanged. Bisulfite-treated DNA was amplified in a final PCR that was optimized to uniformly amplify diverse fragment sizes and sequence contexts in the same reaction. During this final PCR, uracils were copied as thymines, resulting in a thymine in the PCR products wherever an unmethylated cytosine existed in the genomic DNA. These libraries were then sequenced on an Illumina HiSeq 2000 according to the manufacturer's recommendations as paired-end 50 bp reads. Libraries were sequenced to a depth of 600 million aligned reads.

Data Analysis: To analyze the sequence data, Bismark (Krueger and Andrews, 2011) was used to align sequence reads. Generally, each read went through an in-silico conversion of Cs to Ts and was then aligned to fully converted plus and minus strands of the hg19 build of the human genome. A few custom refinements were made to the Bismark program. Since these libraries were made in a directional orientation, with the first read always being C-poor, we skipped unnecessary alignments to impossible orientations.
We also implemented a more stringent uniqueness filter, only allowing reads that have one acceptable alignment (based on default Bowtie parameters) across both strands. Once reads were aligned, the percent methylation was calculated for each cytosine using the original sequence reads. The percent methylation and number of reads are reported for each CpG in the wgEncodeHaibMethylWgbsXXXXCpg.bigBed file and for each non-CpG cytosine in the wgEncodeHaibMethylWgbsXXXXNoncpg.bigBed file.
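
The two computational steps described in the Data Analysis section can be illustrated with a minimal Python sketch. It is not the Bismark code or the submitters' pipeline. The function names and the (chromosome, position, base) input format are hypothetical, chosen only for illustration, and the sketch assumes forward-strand reads in which a read base of C at a reference cytosine means methylated and T means unmethylated.

```python
from collections import defaultdict


def bisulfite_convert(seq):
    """In-silico conversion: replace every C with T (hypothetical helper)."""
    return seq.upper().replace("C", "T")


def percent_methylation(calls):
    """Summarise per-cytosine methylation from original (unconverted) read bases.

    calls: iterable of (chrom, pos, base) tuples, one per read base that
           overlaps a reference cytosine (assumed input format).
    Returns {(chrom, pos): (number_of_reads, percent_methylated)}.
    """
    counts = defaultdict(lambda: [0, 0])  # site -> [covering reads, methylated reads]
    for chrom, pos, base in calls:
        if base not in ("C", "T"):
            continue  # any other base is uninformative at a cytosine
        site = counts[(chrom, pos)]
        site[0] += 1
        if base == "C":  # C survived bisulfite treatment, so it was methylated
            site[1] += 1
    return {site: (n, 100.0 * m / n) for site, (n, m) in counts.items()}


# Toy example: four reads cover chr1:100 and two reads cover chr1:250.
calls = [
    ("chr1", 100, "C"), ("chr1", 100, "C"), ("chr1", 100, "T"), ("chr1", 100, "C"),
    ("chr1", 250, "T"), ("chr1", 250, "T"),
]
result = percent_methylation(calls)
print(bisulfite_convert("ACGTCCGA"))  # ATGTTTGA
print(result[("chr1", 100)])          # (4, 75.0)
print(result[("chr1", 250)])          # (2, 0.0)
```

Each returned pair corresponds to the two values the abstract says are reported per cytosine: the number of covering reads and the percentage of those reads that were methylated. Reads from the opposite strand would need the complementary G/A logic, which this sketch omits.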

ORGANISM(S): Homo sapiens  

SUBMITTER: Richard Myers, UCSC ENCODE DCC, Florencia Pauli

PROVIDER: E-GEOD-40832 | ArrayExpress | 2012-09-12



Similar Datasets

2011-03-01 | E-GEOD-27584 | ArrayExpress
2014-09-11 | E-GEOD-58217 | ArrayExpress
2011-06-25 | E-GEOD-30179 | ArrayExpress
2014-02-18 | E-GEOD-34425 | ArrayExpress
| GSE34425 | GEO
| GSE36845 | GEO
2012-06-01 | E-GEOD-37454 | ArrayExpress
2010-05-11 | E-MEXP-2698 | ArrayExpress
2012-06-01 | GSE37454 | GEO
2013-07-03 | E-GEOD-41701 | ArrayExpress