LTR retrotransposons transcribed in oocytes drive species-specific and heritable changes in DNA methylation.
ABSTRACT: De novo DNA methylation (DNAme) during mouse oogenesis occurs within transcribed regions enriched for H3K36me3. As many oocyte transcripts originate in long terminal repeats (LTRs), which are heterogeneous even between closely related mammals, we examined whether species-specific LTR-initiated transcription units (LITs) shape the oocyte methylome. Here we identify thousands of syntenic regions in mouse, rat, and human that show divergent DNAme associated with private LITs, many of which initiate in lineage-specific LTR retrotransposons. Furthermore, CpG island (CGI) promoters methylated in mouse and/or rat, but not human oocytes, are embedded within rodent-specific LITs and vice versa. Notably, at a subset of such CGI promoters, DNAme persists on the maternal genome in fertilized and parthenogenetic mouse blastocysts or in human placenta, indicative of species-specific epigenetic inheritance. Polymorphic LITs are also responsible for disparate DNAme at promoter CGIs in distantly related mouse strains, revealing that LITs also promote intra-species divergence in CGI DNAme.
Project description:De novo DNA methylation (DNAme) occurs coincident with transcription during mouse oogenesis. As many oocyte transcripts originate in Long Terminal Repeats (LTRs), which are divergent across species, we examined whether polymorphic LTR-initiated transcription units (LITs) shape the oocyte methylome. We identified thousands of syntenic regions in mouse, rat and human, including CpG islands (CGIs), that show divergent DNAme associated with polymorphic LITs. Notably, many CGI promoters methylated exclusively in mouse and/or rat are embedded within rodent-specific LITs, and show persistent maternal methylation in the blastocyst. Polymorphic LITs are also responsible for divergent methylation of CGI promoters in distantly related mouse strains, revealing that LITs also promote intra-species diversification of promoter DNAme. Overall design: Total RNA-seq in mouse and rat oocytes; H3K36me3 ChIP-seq in mouse oocytes
Project description:De novo DNA methylation (DNAme) during mammalian spermatogenesis yields a densely methylated genome, with the exception of CpG islands (CGIs), which are hypomethylated in sperm. While the paternal genome undergoes widespread DNAme loss before the first S-phase following fertilization, recent mass spectrometry analysis revealed that the zygotic paternal genome is paradoxically also subject to a low level of de novo DNAme. However, the loci involved and the impact on transcription were not addressed. Here, we employ allele-specific analysis of whole-genome bisulphite sequencing data and show that a number of genomic regions, including several dozen CGI promoters, are de novo methylated on the paternal genome by the 2-cell stage. A subset of these promoters maintains DNAme through development to the blastocyst stage. Consistent with paternal DNAme acquisition, many of these loci are hypermethylated in androgenetic blastocysts but hypomethylated in parthenogenetic blastocysts. Paternal DNAme acquisition is lost following maternal deletion of Dnmt3a, with a subset of promoters, which are normally transcribed from the paternal allele in blastocysts, being prematurely transcribed at the 4-cell stage in maternal Dnmt3a knockout embryos. These observations uncover a role for maternal DNMT3A activity in post-fertilization epigenetic reprogramming and transcriptional silencing of the paternal genome.
Project description:Human and mouse genomes contain a similar number of CpG islands (CGIs), which are discrete CpG-rich DNA sequences associated with transcription start sites. In both species, ~50% of all CGIs are remote from annotated promoters but, nevertheless, often have promoter-like features. To determine the role of CGI methylation in cell differentiation, we analyzed DNA methylation at a comprehensive CGI set in cells of the mouse hematopoietic lineage. Using a method that potentially detects ~33% of genomic CpGs in the methylated state, we found that large differences in gene expression were accompanied by surprisingly few DNA methylation changes. There were, however, many DNA methylation differences between hematopoietic cells and a distantly related tissue, brain. Altered DNA methylation in the immune system occurred predominantly at CGIs within gene bodies, which have the properties of cell type-restricted promoters, but infrequently at annotated gene promoters or CGI flanking sequences (CGI "shores"). Unexpectedly, elevated intragenic CGI methylation correlated with silencing of the associated gene. Differentially methylated intragenic CGIs tended to lack H3K4me3 and associate with a transcriptionally repressive environment regardless of methylation state. Our results indicate that DNA methylation changes play a relatively minor role in the late stages of differentiation and suggest that intragenic CGIs represent regulatory sites of differential gene expression during the early stages of lineage specification.
Project description:CpG islands (CGIs) are vertebrate genomic landmarks that encompass the promoters of most genes and often lack DNA methylation. Querying their apparent importance, the number of CGIs is reported to vary widely in different species and many do not co-localise with annotated promoters. We set out to quantify the number of CGIs in mouse and human genomes using CXXC Affinity Purification plus deep sequencing (CAP-seq). We also asked whether CGIs not associated with annotated transcripts share properties with those at known promoters. We found that, contrary to previous estimates, CGI abundance in humans and mice is very similar and many are at conserved locations relative to genes. In each species CpG density correlates positively with the degree of H3K4 trimethylation, supporting the hypothesis that these two properties are mechanistically interdependent. Approximately half of mammalian CGIs (>10,000) are "orphans" that are not associated with annotated promoters. Many orphan CGIs show evidence of transcriptional initiation and dynamic expression during development. Unlike CGIs at known promoters, orphan CGIs are frequently subject to DNA methylation during development, and this is accompanied by loss of their active promoter features. In colorectal tumors, however, orphan CGIs are not preferentially methylated, suggesting that cancer does not recapitulate a developmental program. Human and mouse genomes have similar numbers of CGIs, over half of which are remote from known promoters. Orphan CGIs nevertheless have the characteristics of functional promoters, though they are much more likely than promoter CGIs to become methylated during development and hence lose these properties. The data indicate that orphan CGIs correspond to previously undetected promoters whose transcriptional activity may play a functional role during development.
Project description:DNA methylation at the promoter of a gene is presumed to render it silent, yet a sizable fraction of genes with methylated proximal promoters exhibit elevated expression. Here, we show, through extensive analysis of the methylome and transcriptome in 34 tissues, that in many such cases, transcription is initiated by a distal upstream CpG island (CGI) located several kilobases away that functions as an alternative promoter. Specifically, such genes are expressed precisely when the neighboring CGI is unmethylated but remain silenced otherwise. Based on CAGE and Pol II localization data, we found strong evidence of transcription initiation at the upstream CGI and a lack thereof at the methylated proximal promoter itself. Consistent with their alternative promoter activity, CGI-initiated transcripts are associated with signals of stable elongation and splicing that extend into the gene body, as evidenced by tissue-specific RNA-seq and other DNA-encoded splice signals. Furthermore, based on both inter- and intra-species analyses, such CGIs were found to be under greater purifying selection relative to CGIs upstream of silenced genes. Overall, our study describes a hitherto unreported conserved mechanism of transcription of genes with methylated proximal promoters in a tissue-specific fashion. Importantly, this phenomenon explains the aberrant expression patterns of some cancer driver genes, potentially due to aberrant hypomethylation of distal CGIs, despite methylation at proximal promoters.
Project description:CpG islands (CGIs) are dense clusters of CpG sequences that punctuate the CpG-deficient human genome and associate with many gene promoters. As CGIs also differ from bulk chromosomal DNA by their frequent lack of cytosine methylation, we devised a CGI enrichment method based on nonmethylated CpG affinity chromatography. The resulting library was sequenced to define a novel human blood CGI set that includes many that are not detected by current algorithms. Approximately half of CGIs were associated with annotated gene transcription start sites, the remainder being intra- or intergenic. Using an array representing over 17,000 CGIs, we established that 6%-8% of CGIs are methylated in genomic DNA of human blood, brain, muscle, and spleen. Inter- and intragenic CGIs are preferentially susceptible to methylation. CGIs showing tissue-specific methylation were overrepresented at numerous genetic loci that are essential for development, including HOX and PAX family members. The findings enable a comprehensive analysis of the roles played by CGI methylation in normal and diseased human tissues.
Project description:The temporal and spatial expression of genes is controlled by promoters and enhancers. Findings obtained over the last decade that not only promoters but also enhancers are characterized by bidirectional, divergent transcription have challenged the traditional notion that promoters and enhancers represent distinct classes of regulatory elements. Over half of human promoters are associated with CpG islands (CGIs), relatively CpG-rich stretches of generally several hundred nucleotides that are often associated with housekeeping genes. Only about 6% of transcribed enhancers defined by CAGE-tag analysis are associated with CGIs. Here, we present an analysis of enhancer and promoter characteristics and relate them to the presence or absence of CGIs. We show that transcribed enhancers share a number of CGI-dependent characteristics with promoters, including statistically significant local overrepresentation of core promoter elements. CGI-associated enhancers are longer, display higher directionality of transcription, greater expression, a lesser degree of tissue specificity, and a higher frequency of transcription-factor binding events than non-CGI-associated enhancers. Genes putatively regulated by CGI-associated enhancers are enriched for transcription regulator activity. Our findings show that CGI-associated transcribed enhancers display a series of characteristics related to sequence, expression and function that distinguish them from enhancers not associated with CGIs.
Project description:CpG islands (CGIs) are commonly used as genomic markers to study the patterns and regulatory consequences of DNA methylation. Interestingly, recent studies reveal a substantial diversity among CGIs: long and short CGIs, for example, exhibit contrasting patterns of gene expression complexity and nucleosome occupancy. Evolutionary origins of CGIs are also highly heterogeneous. In order to systematically evaluate potential diversities among CGIs and ultimately to illuminate the link between diversity of CGIs and their epigenetic variation, we analyzed the nucleotide-resolution DNA methylation maps (methylomes) of multiple cellular origins. We discover novel 'clusters' of CGIs according to their patterns of DNA methylation; the stably hypomethylated CGI cluster (cluster I), sperm-hypomethylated CGI cluster (cluster II), and variably methylated CGI cluster (cluster III). These epigenomic CGI clusters are strikingly distinct at multiple biological features including genomic, evolutionary, and functional characteristics. At the genomic level, the stably hypomethylated CGI cluster tends to be longer and harbors many more CpG dinucleotides than those in other clusters. They are also frequently associated with promoters, while CGI clusters II and III mostly reside in intragenic or intergenic regions and exhibit highly tissue-specific DNA methylation. Functional ontology terms and transcriptional profiles co-vary with CGI clusters, indicating that the regulatory functions of CGIs are tightly linked to their heterogeneity. Finally, CGIs associated with distinctive biological processes, such as diseases, aging, and imprinting, occur disproportionately across CGI clusters. These new findings provide an effective means to combine existing knowledge on CGIs into a genomic context while bringing new insights that elucidate the significance of DNA methylation across different biological conditions and demography.
Project description:CpG islands (CGIs) are associated with the majority of mammalian gene promoters and function to recruit chromatin modifying enzymes. It has therefore been proposed that CGIs regulate gene expression through chromatin-based mechanisms; however, in most cases this has not been directly tested. Here, we reveal that the histone H3 lysine 36 (H3K36) demethylase activity of the CGI-binding KDM2 proteins contributes only modestly to the H3K36me2-depleted state at CGI-associated gene promoters and is dispensable for normal gene expression. Instead, we discover that KDM2 proteins play a widespread and demethylase-independent role in constraining gene expression from CGI-associated gene promoters. We further show that KDM2 proteins shape RNA Polymerase II occupancy but not chromatin accessibility at CGI-associated promoters. Together, this reveals a demethylase-independent role for KDM2 proteins in transcriptional repression and uncovers a new function for CGIs in constraining gene expression.
Project description:CpG islands (CGIs) are prominent in the mammalian genome owing to their GC-rich base composition and high density of CpG dinucleotides. Most human gene promoters are embedded within CGIs that lack DNA methylation and coincide with sites of histone H3 lysine 4 trimethylation (H3K4me3), irrespective of transcriptional activity. In spite of these intriguing correlations, the functional significance of non-methylated CGI sequences with respect to chromatin structure and transcription is unknown. By performing a search for proteins that are common to all CGIs, here we show high enrichment for Cfp1, which selectively binds to non-methylated CpGs in vitro. Chromatin immunoprecipitation of a mono-allelically methylated CGI confirmed that Cfp1 specifically associates with non-methylated CpG sites in vivo. High throughput sequencing of Cfp1-bound chromatin identified a notable concordance with non-methylated CGIs and sites of H3K4me3 in the mouse brain. Levels of H3K4me3 at CGIs were markedly reduced in Cfp1-depleted cells, consistent with the finding that Cfp1 associates with the H3K4 methyltransferase Setd1 (refs 7, 8). To test whether non-methylated CpG-dense sequences are sufficient to establish domains of H3K4me3, we analysed artificial CpG clusters that were integrated into the mouse genome. Despite the absence of promoters, the insertions recruited Cfp1 and created new peaks of H3K4me3. The data indicate that a primary function of non-methylated CGIs is to genetically influence the local chromatin modification state by interaction with Cfp1 and perhaps other CpG-binding proteins.