Genotyping of E14 mouse embryonic stem cells by sequencing
ABSTRACT: More than 2 x 10^9 sequences generated on the Illumina platform from the genome of E14 embryonic stem cells cultured in our laboratory were used to build a database of about 2.7 x 10^6 single nucleotide variants. The database was validated using two other sequencing datasets from other laboratories, and a high overlap was observed. The identified variants are enriched in intergenic regions, but several thousand reside in gene exons and regulatory regions, such as promoters, enhancers, splice sites and untranslated regions of RNA, indicating a high probability of an important functional impact on the molecular biology of these cells. We created a new E14 genome assembly that includes the newly identified variants and used it to map reads from next generation sequencing data generated on the E14 cell line in our laboratory or in others. We observed an increase of about 5% in the number of mapped reads. CpG dinucleotides showed the highest variation frequency, probably because they can be targets of DNA methylation. We performed reduced representation bisulfite sequencing on the E14 cell line to test our new genome assembly against the mm9 reference genome. After mapping and methylation status calling, we obtained about 120,000 additional called CpGs and avoided about 20,000 erroneous CpG calls. Overall design: genotyping of E14 embryonic stem cells (ESCs) and Reduced representation Bisulfite Sequencing (RRBS) of E14 ESCs.
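The effect of strain-specific variants on CpG calling can be illustrated with a minimal sketch. It is not the pipeline used in the study: the toy sequence, the variant list and the helper names `apply_snvs` and `cpg_positions` are all hypothetical. It substitutes single nucleotide variants into a reference sequence and reports which CpG dinucleotides are lost (and would be called wrongly against the unmodified reference) and which are gained (and would be missed).

```python
def apply_snvs(reference, snvs):
    """Return a copy of `reference` with each SNV substituted.

    `snvs` is a list of (0-based position, reference base, alternate base).
    """
    seq = list(reference)
    for pos, ref_base, alt_base in snvs:
        if seq[pos] != ref_base:
            raise ValueError(f"reference base mismatch at position {pos}")
        seq[pos] = alt_base
    return "".join(seq)


def cpg_positions(seq):
    """Return the set of 0-based positions of the C in every CG dinucleotide."""
    return {i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"}


# Hypothetical toy data, for illustration only.
reference = "ACGTTCATCGGA"
snvs = [(2, "G", "A"), (6, "A", "G")]

strain = apply_snvs(reference, snvs)
lost = cpg_positions(reference) - cpg_positions(strain)    # CpGs absent in the strain
gained = cpg_positions(strain) - cpg_positions(reference)  # CpGs created in the strain
print(strain, sorted(lost), sorted(gained))
```

In this toy case one variant destroys a reference CpG and the other creates a new one, which mirrors the two kinds of correction reported in the abstract.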
Project description:DNA methylation is catalysed by DNA methyltransferases (DNMTs) and is necessary for correct embryonic development. DNA demethylation, on the other hand, is mediated by the Ten-Eleven Translocation (Tet) proteins through oxidation of 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC), and by Thymine-DNA glycosylase (TDG), which excises 5fC and 5caC. In embryonic stem cells (ESCs), gene promoters are maintained in a hypomethylated state, but the dynamics of this phenomenon remain unknown. Here we present a genome-wide approach, named methylation-assisted bisulfite sequencing (MAB-Seq), that enables single-base resolution mapping of 5fC and 5caC and measurement of their relative abundance. Application of this method to mouse ESCs revealed the presence of 5fC/5caC residues on the hypomethylated promoters of expressed genes, indicating an active DNA demethylation mechanism, since the loss of TDG leads to an increase of 5fC/5caC. We also show that TDG is bound to these regions and that it co-localizes and interacts with Tet1. Moreover, we demonstrate by reduced representation bisulfite sequencing (RRBS) that active promoters are demethylated by a Tet-dependent mechanism and that Dnmt1 and Dnmt3a are responsible for this DNA methylation. Our work provides a whole-genome map of 5fC and 5caC at single-base resolution in ESCs, demonstrates in detail the DNA methylation dynamics occurring on expressed gene promoters, and identifies the key players of this mechanism. Furthermore, we provide a new tool (MAB-Seq) that can be broadly used in any biological context for epigenetic studies involving identification and quantification of 5fC and 5caC at single-base resolution. Methylation-assisted bisulfite sequencing (MAB-Seq) of E14 embryonic stem cells (ESCs), Biotag ChIP-Seq of Tdg and Reduced representation Bisulfite Sequencing (RRBS) in E14 ESCs.
Project description:Up until now, the existence of Dnmt2-mediated DNA methylation has mostly been supported by focal analyses in organisms that contain Dnmt2 but no Dnmt1 or Dnmt3 DNA methyltransferase. In these organisms, several independent studies have also provided support for a biologically important function of Dnmt2-dependent DNA methylation. For example, Dnmt2-dependent methylation in Entamoeba histolytica, the causative agent of amebic dysentery, has been connected to the parasite's virulence. However, global DNA methylation levels in Entamoeba have been found to be very low. In addition, no specific features, such as CpG specificity or specificity for certain genetic subcompartments, have been described. This distinguishes Dnmt2-dependent methylation patterns from all other known methylomes and has raised questions about the validity of the underlying results. We have used whole-genome bisulfite sequencing for an unbiased characterization of the Entamoeba histolytica methylome at single-base resolution in an E. histolytica strain HM-1:IMSS devoid of significant levels of EhDnmt2 (Ehmeth) expression. Paired-end BS-sequencing was performed on an Illumina Genome Analyzer with read lengths of 105 base pairs and an average insert size of 200 bp.
Project description:Recent large-scale studies have defined genome-wide, cell type-specific patterns of DNA methylation, a modification known to be important for regulating gene expression in both normal development and disease states. However, determining the functional significance of specific methylation events remains a challenging problem due to the current lack of targeted methodologies for removing these modifications. Here we describe an approach for efficient targeted demethylation of specific CpGs in human cells using fusions of engineered transcription activator-like effector (TALE) repeat arrays and the TET1 hydroxylase catalytic domain. Using these TALE-TET1 fusions, we demonstrate that modification of certain critical methylated promoter CpG positions can be associated with substantial increases in endogenous human gene expression. Our results delineate a general strategy for defining the functional significance of specific CpG methylation marks in the context of endogenous gene loci and validate new programmable DNA demethylation reagents with broad utility for research and potential therapeutic applications. Bisulfite sequencing of three different loci in three different cell lines (Klf4 in K562s, HBB in K562s, and RHOXF2 in 293s and HeLas). Biological triplicates of all samples and controls (off-target and GFP controls).
Project description:The de novo DNA methyltransferase 3-like (Dnmt3L) is a catalytically inactive DNA methylase that has previously been shown to cooperate with Dnmt3a and Dnmt3b to methylate DNA. Dnmt3L is highly expressed in mouse embryonic stem cells (ESC), but its function in these cells is unknown. Here we report that Dnmt3L is required for the differentiation of ESC into primordial germ cells (PGC) through activation of the homeotic gene Rhox5. By genome-wide analysis we found that Dnmt3L is a positive regulator of methylation at gene bodies of housekeeping genes and a negative regulator of methylation at promoters of bivalent genes. We demonstrate that Dnmt3L interacts with the Polycomb PRC2 complex in competition with the DNA methyltransferases Dnmt3a and Dnmt3b to keep the methylation level low at H3K27me3 regions. Thus in ESC, Dnmt3L counteracts the activity of de novo DNA methylases to keep the level of DNA methylation low at developmental gene promoters. Examination of 5mC in shGFP and shDnmt3L ESC by MeDIP-Seq.
Project description:This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Florencia Pauli mailto:email@example.com). If you have questions about the Genome Browser track associated with this data, contact ENCODE (mailto:firstname.lastname@example.org). This track shows average methylation status in CpG islands. In general, methylation of CpG sites within a promoter causes silencing of the gene associated with that promoter. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf CpG regions were assayed via Methyl-seq, a method developed in the Myers laboratory to measure the methylation status at CpGs throughout the genome. It combines DNA digestion by the methyl-sensitive enzyme HpaII and its methyl-insensitive isoschizomer MspI with the Illumina DNA sequencing platform. The method was first applied in a collaboration with the laboratory of Dr. Julie Baker at Stanford University to study methylation and gene expression changes that occur in human embryonic stem cells before and after differentiation to definitive endoderm. A paper describing the results as well as the method has been submitted for publication. This study profiled genomic DNA and mRNA samples derived from two human embryonic stem cell lines: H9 and BG02. These cells were differentiated into definitive endoderm, embryoid bodies, embryoid body-derived cells, and AFP+ (alpha-fetoprotein positive) hepatocytes. These in vitro samples were profiled with Methyl-seq and compared with normal tissue samples from 11-week and 24-week fetal liver and adult liver. Methyl-seq assays more than 250,000 methyl-sensitive restriction enzyme cleavage sites, representing more than 90,000 genomic regions.
These regions include 35,528 annotated CpG islands, while the remaining 55,084 non-CpG island regions are distributed across the genome in promoters, genes, and intergenic regions. Sequence tags present in MspI libraries but not in HpaII libraries are derived from methylated regions. Conversely, sequence tags that occur in HpaII libraries come from at least partially unmethylated regions. In vitro differentiation: Definitive endoderm precursor cells were generated from H9 hES cells by treating them with activin A. Embryoid bodies (EBs) were generated by growing undifferentiated H9 and BG02 hESCs in suspension. EB-derived cells were obtained by plating clumps of the cells from the EBs. AFP+ fetal hepatocytes were derived from EBs by plating EB cells with FGF, followed by fluorescence activated cell sorting (FACS) to isolate cells expressing the green fluorescent protein (GFP) reporter gene driven from the AFP promoter. Isolation of genomic DNA: Genomic DNA is isolated from biological replicates of each cell line by using the QIAGEN DNeasy Blood & Tissue Kit according to the instructions provided by the manufacturer. The DNA concentration and quality of each preparation are determined by UV absorbance. HpaII and MspI digestions: Cleavage of DNA by restriction endonuclease HpaII is prevented by the presence of a 5-methyl group at the internal C residue of its recognition sequence CCGG. MspI, an isoschizomer of HpaII, cleaves DNA irrespective of the presence of a methyl group at this position. For the MspI library, 5 µg genomic DNA was digested in a 100 µl reaction with 1X NEB Buffer2 and 20 units MspI restriction enzyme and incubated for 18 hr at 37°C. For the HpaII library, 5 µg genomic DNA was digested in a 100 µl reaction with 1X NEB Buffer1 and 20 units HpaII restriction enzyme and incubated for 18 hr at 37°C.
Note that in subsequent versions of the Methyl-seq protocol, which will be described later, much lower amounts of genomic DNA were used (1 µg and potentially lower). DNA library construction and sequencing: High-throughput sequencing libraries were generated from DNA fragments of the HpaII or MspI digested genomic DNA according to the protocol posted at the website: http://myers.hudsonalpha.org/content/protocols.html. This approach was recently modified by removing the first PCR amplification step, just prior to the gel electrophoresis size-selection step, which was found to reduce a fragment-size bias in the sequencing libraries. These libraries were sequenced with an Illumina Genome Analyzer (GA2) according to the manufacturer's recommendations. Data analysis: For this analysis, reads that align to human genome sequence version hg19 and contain the 5'-CGG-3' HpaII-cut signature on their 5' end were used. These aligned sequence reads were mapped to CCGG sites predicted in silico on hg19. Sites with four or more MspI tags occurring in either the forward or reverse direction were retained for analysis. These "assayable" sites were then grouped with neighboring sites that are within 35-75 bp of each other. Thus, a "region" can comprise between 2 and 18 digestion sites that are each within 35-75 bp of another site. Methylated and non-methylated calls were made by using HpaII tag data from all assayable cut sites. For each site across each region, the larger of either the forward read count or reverse read count was used. Regions that have an average of 0 or 1 read per cut site are called methylated, and regions with more than one sequence read per site are called unmethylated.
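The region-calling rules in the data analysis paragraph can be sketched as follows. This is one reading of the prose, not the ENCODE pipeline itself. The `Site` record, the function name `call_regions` and the example counts are hypothetical. The sketch assumes that "four or more MspI tags in either direction" means the larger of the two strand counts is at least 4, and that grouping is done on consecutive assayable sites. It does not enforce the upper limit of 18 sites per region.

```python
from typing import NamedTuple


class Site(NamedTuple):
    pos: int      # genomic coordinate of the CCGG site
    msp_fwd: int  # MspI tags, forward strand
    msp_rev: int  # MspI tags, reverse strand
    hpa_fwd: int  # HpaII tags, forward strand
    hpa_rev: int  # HpaII tags, reverse strand


def call_regions(sites, min_msp=4, min_gap=35, max_gap=75):
    """Group assayable sites into regions and call each region's methylation."""
    # Keep only sites with enough MspI coverage in at least one direction.
    assayable = sorted(
        (s for s in sites if max(s.msp_fwd, s.msp_rev) >= min_msp),
        key=lambda s: s.pos,
    )
    # Chain consecutive sites whose spacing falls within [min_gap, max_gap].
    regions, current = [], []
    for site in assayable:
        if current and not (min_gap <= site.pos - current[-1].pos <= max_gap):
            regions.append(current)
            current = []
        current.append(site)
    if current:
        regions.append(current)

    calls = []
    for region in regions:
        if len(region) < 2:  # a region needs at least two digestion sites
            continue
        # Per site, use the larger of the forward and reverse HpaII counts.
        mean_hpa = sum(max(s.hpa_fwd, s.hpa_rev) for s in region) / len(region)
        status = "methylated" if mean_hpa <= 1 else "unmethylated"
        calls.append((region[0].pos, region[-1].pos, status))
    return calls


# Hypothetical example counts, for illustration only.
sites = [
    Site(100, 5, 2, 0, 1),
    Site(150, 4, 4, 1, 0),
    Site(400, 6, 1, 8, 3),
    Site(460, 2, 7, 5, 9),
    Site(900, 1, 3, 4, 4),  # fewer than 4 MspI tags in both directions
]
calls = call_regions(sites)
print(calls)
```

The first two sites form a region with an average of one HpaII read per site and are called methylated. The next two form a region with high HpaII counts and are called unmethylated. The last site is dropped because it is not assayable.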
Project description:DNA methylation is a defining feature of mammalian cellular identity and is essential for normal development. Most cell types, except germ cells and pre-implantation embryos, display relatively stable DNA methylation patterns, with 70-80% of all CpGs being methylated. Despite recent advances, our understanding of when, where and how many CpGs participate in genomic regulation remains limited. Here we report the in-depth analysis of 42 whole genome bisulfite sequencing (WGBS) data sets across 30 diverse human cell and tissue types. We observe dynamic regulation for only 21.8% of autosomal CpGs within a normal developmental context, a majority of which are distal to transcription start sites. These dynamic CpGs co-localize with gene regulatory elements, particularly enhancers and transcription factor binding sites (TFBS), which allows identification of key lineage-specific regulators. In addition, differentially methylated regions (DMRs) often harbor SNPs associated with cell type-related diseases as determined by GWAS. The results also highlight the general inefficiency of WGBS, as 70-80% of the sequencing reads across these data sets provided little or no relevant information regarding CpG methylation. To further demonstrate the utility of our DMR set, we use it to classify unknown samples and identify representative signature regions that recapitulate major DNA methylation dynamics. In summary, although in theory every CpG can change its methylation state, our results confirm that only a fraction does so as part of coordinated regulatory programs. Therefore our selected DMRs can serve as a starting point to help guide novel, more effective reduced representation approaches to capture the most informative fraction of CpGs as well as further pinpoint putative regulatory elements. Analysis of 42 human WGBS libraries comprising 30 distinct primary cell lines, tissues, in vitro derived cell types and cell lines.
BiSeq raw sequencing reads were aligned using MAQ in bisulfite mode (Li et al. 2008) or BSMAP 2.7 (Xi et al. 2009) against human genome version hg19/GRCh37, discarding duplicate reads. DNA methylation calling was performed with an extended version of a custom software pipeline published previously for RRBS (Gu et al., 2010). The bed files contain all CpGs observed within the given library. The number of methylated reads/number of total reads is listed in the score column. This study includes a re-analysis of samples from the NIH Roadmap Epigenomics Mapping Consortium (REMC; GSE16256, GSE17312), Hodges et al. 2011 (GSE31971), Molaro et al. 2011 (GSE30340), and Xi et al. 2013. WARNING: This submission is incomplete pending further deposits of raw data files from Meissner via EDACC.
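As a rough illustration of how such bed files might be read, the sketch below parses per-CpG records and converts the score field into a methylation fraction. The exact file layout is an assumption: it supposes tab-separated columns with the score in the fifth column, written literally as "methylated/total" (for example "8/10"). The function name `parse_cpg_bed`, the `min_coverage` parameter and the example lines are hypothetical.

```python
def parse_cpg_bed(lines, min_coverage=1):
    """Parse BED-like CpG records into (chrom, start, end, methylation fraction).

    Assumes the fifth column holds "methylated_reads/total_reads".
    CpGs covered by fewer than `min_coverage` reads are skipped.
    """
    records = []
    for line in lines:
        # Skip blank lines and common BED header or comment lines.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        methylated, total = (int(x) for x in fields[4].split("/"))
        if total >= min_coverage:
            records.append((chrom, start, end, methylated / total))
    return records


# Hypothetical example lines, for illustration only.
example = [
    "track name=example_library\n",
    "chr1\t100\t102\t.\t8/10\n",
    "chr1\t200\t202\t.\t0/4\n",
    "chr1\t300\t302\t.\t1/2\n",
]
records = parse_cpg_bed(example, min_coverage=4)
print(records)
```

Filtering on total read count before computing the fraction avoids treating poorly covered CpGs as confidently methylated or unmethylated.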
Project description:5-hydroxymethylcytosine (5hmC) is a recently discovered epigenetic modification that is lost in human cancers. Formation of 5hmC is catalysed by the Ten-eleven translocation (TET) proteins, which mediate the sequential oxidation of 5-methylcytosine (5mC) to 5hmC, leading to eventual DNA demethylation. Several mechanisms can lead to loss of 5hmC in cancers, including mutations in IDH or TET2 genes. However, little is known about the role of TET proteins and 5hmC in adult cells. Here, we report that TET1 downmodulation is required to permit adult cells to proliferate. TET1 is rapidly downmodulated in proliferating primary cells and in regenerating liver. TET1 silencing accelerates cell cycle progression, while its constitutive expression inhibits cell growth. TET1 is a negative regulator of cell proliferation and is regulated during development in a tissue-specific manner. These findings extend our knowledge of how an epigenetic modification, the DNA hydroxymethylation mediated by TET1, is a key player in the control of cell proliferation. Examination of 5hmC in MEF at passage 0 and at passage 5.
Project description:Genetic and epigenetic alterations are essential for the initiation and progression of human cancer. We previously reported that primary human medulloblastomas showed extensive cancer-specific CpG island DNA hypermethylation in critical developmental pathways. To determine whether genetically engineered mouse models (GEMMs) of medulloblastoma have comparable epigenetic changes, we assessed genome-wide DNA methylation in three mouse models of medulloblastoma. In contrast to human samples, very few loci with cancer-specific DNA hypermethylation were detected, and in almost all cases the degree of methylation was relatively modest compared to the dense hypermethylation in the human cancers. To determine if this finding was common to other GEMMs, we examined a Burkitt lymphoma and breast cancer model and did not detect promoter CpG island DNA hypermethylation, suggesting that human cancers and at least some GEMMs are fundamentally different with respect to this epigenetic modification. These findings provide an opportunity to both better understand the mechanism of aberrant DNA methylation in human cancer and construct better GEMMs to serve as preclinical platforms for therapy development. Examination of DNA methylation in one representative human medulloblastoma patient sample and three different mouse models of medulloblastoma using RRBS
Project description:The TET3 CXXC domain has unique DNA binding properties. It binds to DNA in a cytosine-dependent manner, preferring CpG dinucleotides but not being restricted by CpG content, which distinguishes it from other well-characterized CXXC domains. To map the TET3 CXXC domain binding sites across the human genome, we purified the GST-tagged TET3 CXXC domain protein and performed a GST pull-down assay using genomic DNA purified from HEK293T cells. The enriched DNA fragments were then sequenced and aligned to the human genome (hg19). We used the GST pull-down assay followed by DNA deep sequencing to map the DNA bound by the TET3 CXXC domain in vitro.