Efficient and accurate determination of genome-wide DNA methylation patterns in Arabidopsis thaliana with enzymatic methyl sequencing
ABSTRACT: Background: 5′ methylation of cytosines in DNA molecules is an important epigenetic mark in eukaryotes. Bisulfite sequencing is the gold standard of DNA methylation detection, and whole-genome bisulfite sequencing (WGBS) has been widely used to detect methylation at single-nucleotide resolution on a genome-wide scale. However, sodium bisulfite is known to severely degrade DNA, which, in combination with biases introduced during PCR amplification, leads to unbalanced base representation in the final sequencing libraries. Enzymatic conversion of unmethylated cytosines to uracils can achieve the same end product for sequencing as does bisulfite treatment and does not affect the integrity of the DNA; enzymatic methylation sequencing may, thus, provide advantages over bisulfite sequencing. Results: Using an enzymatic methyl-seq (EM-seq) technique to selectively deaminate unmethylated cytosines to uracils, we generated and sequenced libraries based on different amounts of Arabidopsis input DNA and different numbers of PCR cycles, and compared these data to results from traditional whole-genome bisulfite sequencing. We found that EM-seq libraries were more consistent between replicates and had higher mapping and lower duplication rates, lower background noise, higher average coverage, and higher coverage of total cytosines. Differential methylation region (DMR) analysis showed that WGBS tended to over-estimate methylation levels, especially in CHG and CHH contexts, whereas EM-seq detected higher CG methylation levels in certain highly methylated areas. These phenomena can be mostly explained by a correlation of WGBS methylation estimation with GC content and methylated cytosine density. We used EM-seq to compare methylation between leaves and flowers, and found that CHG methylation level is greatly elevated in flowers, especially in pericentromeric regions. Conclusion: We suggest that EM-seq is a more accurate and reliable approach than WGBS to detect methylation. Compared to WGBS, the results of EM-seq are less affected by differences in library preparation conditions or by the skewed base composition in the converted DNA. It may therefore be more desirable to use EM-seq in methylation studies.
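The conversion principle shared by bisulfite and enzymatic treatment can be illustrated with a minimal Python sketch. It is a toy model, not the pipeline used in the study: it assumes a single strand, no sequencing errors, and no indels, and the function names `convert_unmethylated` and `call_methylation` are hypothetical. Unmethylated cytosines are written as T (uracil is read as thymine after amplification), and methylation is then called by comparing the converted read with the reference.

```python
def convert_unmethylated(seq, methylated_positions):
    """Simulate conversion: every unmethylated C is read as T after PCR.

    `methylated_positions` is a set of 0-based indices of protected cytosines.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, read):
    """Call methylation at each reference C covered by an aligned read.

    Returns {position: True if methylated, False if unmethylated}.
    Positions where the read shows neither C nor T are skipped.
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True    # cytosine survived conversion: methylated
        elif read_base == "T":
            calls[i] = False   # cytosine was converted: unmethylated
    return calls


reference = "ACGTCCGATC"
read = convert_unmethylated(reference, methylated_positions={1})
calls = call_methylation(reference, read)
methylation_level = sum(calls.values()) / len(calls)
```

In this toy example only the cytosine at index 1 is protected, so one of the four reference cytosines is called methylated.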
Project description:Whole-genome bisulfite sequencing (WGBS) has been widely used to quantify cytosine DNA methylation frequency in an expanding array of cell and tissue types. Because of the denaturing conditions used, this method ultimately leads to the measurement of methylation frequencies at single cytosines. Hence, the methylation frequency of CpG dyads (two complementary CG dinucleotides) can be only indirectly inferred by overlaying the methylation frequency of two cytosines measured independently. Furthermore, hemi-methylated CpGs (hemiCpGs) have not been previously analyzed in WGBS studies. We recently developed in silico strand annealing (iSA), a bioinformatics method applicable to WGBS data, to resolve the methylation status of CpG dyads into unmethylated, hemi-methylated, and methylated. HemiCpGs account for 4-20% of the DNA methylome in different cell types, and some can be inherited across cell divisions, suggesting a role as a stable epigenetic mark. Therefore, it is important to resolve hemiCpGs from fully methylated CpGs in WGBS studies. This protocol describes step-by-step commands to accomplish this task, including dividing alignments by strand, pairing alignments between strands, and extracting single-fragment methylation calls. The versatility of iSA enables its application downstream of other WGBS-related methods such as nasBS-seq (nascent DNA bisulfite sequencing), ChIP-BS-seq (ChIP followed by bisulfite sequencing), TAB-seq, oxBS-seq, and fCAB-seq. iSA is also tunable for analyzing the methylation status of cytosines in any sequence context. We exemplify this flexibility by uncovering the single-fragment non-CpG methylome. This protocol provides enough details for users with little experience in bioinformatic analysis and takes 2-7 h.
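The three-way classification of CpG dyads described above can be sketched in a few lines of Python. This is not the iSA implementation: it assumes that the top-strand and bottom-strand calls for each dyad have already been paired on a single fragment, and the names `classify_dyad` and `summarize_dyads` are hypothetical.

```python
from collections import Counter


def classify_dyad(top_methylated, bottom_methylated):
    """Classify one CpG dyad from the calls on its two complementary cytosines."""
    if top_methylated and bottom_methylated:
        return "methylated"
    if top_methylated or bottom_methylated:
        return "hemi-methylated"
    return "unmethylated"


def summarize_dyads(paired_calls):
    """Count dyad classes over an iterable of (top, bottom) boolean pairs."""
    return Counter(classify_dyad(top, bottom) for top, bottom in paired_calls)


# Four illustrative dyads: one fully methylated, two hemi-methylated, one unmethylated.
paired_calls = [(True, True), (True, False), (False, True), (False, False)]
summary = summarize_dyads(paired_calls)
```

Keeping the two strands paired is what distinguishes a hemi-methylated dyad from a site where independent per-cytosine frequencies merely average to an intermediate value.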
Project description:Whole-genome bisulfite sequencing (WGBS) and reduced representation bisulfite sequencing (RRBS) are widely used for measuring DNA methylation levels on a genome-wide scale. Both methods have limitations: WGBS is expensive and prohibitive for most large-scale projects; RRBS only interrogates 6-12% of the CpGs in the human genome. Here, we introduce methylation-sensitive restriction enzyme bisulfite sequencing (MREBS) which has the reduced sequencing requirements of RRBS, but significantly expands the coverage of CpG sites in the genome. We built a multiple regression model that combines the two features of MREBS: the bisulfite conversion ratios of single cytosines (as in WGBS and RRBS) as well as the number of reads that cover each locus (as in MRE-seq). This combined approach allowed us to estimate differential methylation across 60% of the genome using read count data alone, and where counts were sufficiently high in both samples (about 1.5% of the genome), our estimates were significantly improved by the single CpG conversion information. We show that differential DNA methylation values based on MREBS data correlate well with those based on WGBS and RRBS. This newly developed technique combines the sequencing cost of RRBS and DNA methylation estimates on a portion of the genome similar to WGBS, making it ideal for large-scale projects of mammalian genomes.
Project description:Background: DNA methylation is a form of epigenetic regulation that plays an important role in regulating gene expression and tissue development. However, the DNA methylation regulators involved in sheep muscle development remain unclear. To explore the functional importance of genome-scale DNA methylation during sheep muscle growth, this study systematically investigated the genome-wide DNA methylation profiles at key stages of Hu sheep development (fetus and adult) using deep whole-genome bisulfite sequencing (WGBS). Results: Our study found that the expression levels of DNA methyltransferase (DNMT)-related genes were lower in fetal muscle than in the muscle of adults. The methylation levels in the CG context were higher than those in the CHG and CHH contexts, and methylation levels were highest in introns, followed by exons and downstream regions. Subsequently, we identified 48,491, 17, and 135 differentially methylated regions (DMRs) in the CG, CHG, and CHH sequence contexts and 11,522 differentially methylated genes (DMGs). The results of bisulfite sequencing PCR (BSP) correlated well with the WGBS-Seq data. Moreover, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) functional annotation analysis revealed that some DMGs were involved in regulating skeletal muscle development and fatty acid metabolism. By combining the WGBS-Seq and previous RNA-Seq data, a total of 159 overlapping genes were obtained between differentially expressed genes (DEGs) and DMGs (FPKM > 10 and fold change > 4). Finally, we found that 9 DMGs were likely to be involved in muscle growth and metabolism of Hu sheep. Conclusions: We systematically studied the global DNA methylation patterns of fetal and adult muscle development in Hu sheep, which provided new insights into the epigenetic regulation of sheep muscle development.
Project description:BACKGROUND: Whole genome bisulfite sequencing (WGBS), also known as BS-seq, has been widely used to measure the methylation of the whole genome at single-base resolution. One of the key steps in the assay is converting unmethylated cytosines into uracils, which are read as thymines after amplification (BS conversion). Incomplete conversion of unmethylated cytosines can introduce false-positive methylation calls. A quick method to evaluate the bisulfite conversion ratio (BCR) would benefit both quality control and data analysis of WGBS. RESULTS: Here we provide a computational method named "BCREval" to estimate the unconverted rate (UCR) by using telomeric repetitive DNA as a native spike-in control. We tested the method on public WGBS data and found that it is very stable and that most BS conversion assays achieve > 99.5% efficiency. The non-CpG DNA methylation at telomeres fits a binomial model and may result from a random process with very low probability (ratio < 0.4%). The comparison between BCREval and Bismark (Krueger and Andrews, Bioinformatics 27:1571-1572, 2011), a widely used BCR evaluator, suggests that our algorithm is much faster and more efficient than the latter. CONCLUSION: Our method is a simple but robust way to perform QC and estimate the BCR of WGBS experiments to make sure it reaches an acceptable level. It is faster and more efficient than current tools and can be easily integrated into existing WGBS pipelines.
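The idea of using telomeric repeats as a native control can be sketched as follows. This is an illustrative approximation and not the BCREval algorithm. It assumes the vertebrate telomeric repeat, whose C-rich strand reads CCCTAA and becomes TTTTAA when fully converted; the pattern, the minimum of four tandem units, and the function name `unconverted_rate` are all assumptions made for this example.

```python
import re

# Tandem telomeric units on the C-rich strand, where each of the three
# cytosine positions may be unconverted (C) or converted (T).
# Requiring at least 4 consecutive units is an arbitrary, assumed threshold.
REPEAT = re.compile(r"(?:[CT]{3}TAA){4,}")


def unconverted_rate(reads):
    """Estimate the fraction of telomeric cytosines that escaped conversion.

    Returns None when no telomeric repeat block is found in the reads.
    """
    unconverted = converted = 0
    for read in reads:
        for match in REPEAT.finditer(read):
            block = match.group()
            # Each unit is 6 bases long; its first 3 bases are the cytosine positions.
            for start in range(0, len(block), 6):
                unit = block[start:start + 3]
                unconverted += unit.count("C")
                converted += unit.count("T")
    total = unconverted + converted
    return unconverted / total if total else None


reads = [
    "TTTTAA" * 4,                  # fully converted repeat block
    "TTTTAATTCTAATTTTAATTTTAA",    # one unconverted cytosine
    "ACGTACGT",                    # non-telomeric read, ignored
]
ucr = unconverted_rate(reads)      # 1 unconverted out of 24 cytosine positions
```

Under these assumptions, the conversion efficiency is one minus the returned rate.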
Project description:A combination of bisulfite treatment of DNA and high-throughput sequencing (BS-Seq) can capture a snapshot of a cell's epigenomic state by revealing its genome-wide cytosine methylation at single base resolution. Bismark is a flexible tool for the time-efficient analysis of BS-Seq data which performs both read mapping and methylation calling in a single convenient step. Its output discriminates between cytosines in CpG, CHG and CHH context and enables bench scientists to visualize and interpret their methylation data soon after the sequencing run is completed. Bismark is released under the GNU GPLv3+ licence. The source code is freely available from www.bioinformatics.bbsrc.ac.uk/projects/bismark/.
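The distinction between CpG, CHG and CHH contexts depends only on the two bases downstream of a cytosine, where H stands for A, C or T. The following Python sketch shows that rule in isolation. It is not Bismark code, it looks at one strand only, and the function name `cytosine_context` is hypothetical.

```python
H_BASES = {"A", "C", "T"}  # IUPAC code H: any base except G


def cytosine_context(seq, i):
    """Return "CG", "CHG" or "CHH" for the cytosine at 0-based index i.

    Returns None when the context cannot be determined, for example near
    the end of the sequence or when an ambiguous base such as N follows.
    """
    if seq[i] != "C":
        raise ValueError("position %d is not a cytosine" % i)
    downstream = seq[i + 1:i + 3]
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) < 2 or downstream[0] not in H_BASES:
        return None
    if downstream[1] == "G":
        return "CHG"
    if downstream[1] in H_BASES:
        return "CHH"
    return None


seq = "CGCAGCTTC"
contexts = {i: cytosine_context(seq, i) for i, base in enumerate(seq) if base == "C"}
```

For cytosines on the opposite strand, the same function can be applied to the reverse complement of the sequence.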
Project description:Background: DNA methylation plays essential roles in tumor occurrence and stemness maintenance. Tumor-repopulating cells (TRCs) are cancer stem cell (CSC)-like cells with highly tumorigenic and self-renewing abilities, which were selected from tumor cells in soft three-dimensional (3D) fibrin gels. Methods: Here, we present a genome-wide map of methylated cytosines for time-series samples during TRC selection in 3D culture, using whole-genome bisulfite sequencing (WGBS). Results: A comparative analysis revealed that the methylation degrees of many differentially methylated genes (DMGs) were increased by the mechanical environment as it changed from 2D rigid to 3D soft. DMGs were significantly enriched in stemness-related terms. On day 1, TRCs had the highest non-CG methylation rate, indicating their strong stemness. We found that genes with continuously increasing or decreasing methylation, such as CREB5, ADAMTS6, and LMX1A, may also affect the TRC screening process. Furthermore, stage-specific and common CSC markers were biased toward changing their methylation in non-CG contexts (CHG and CHH, where H corresponds to A, T, or C) and were enriched in gene body regions. Conclusions: WGBS provides DNA methylomes during TRC screening. Non-CG DNA methylation plays an important role in TRC selection, which indicates that it is more sensitive to mechanical microenvironments and affects TRCs by regulating the expression of stemness genes in tumor cells.
Project description:The development of whole-genome bisulfite sequencing (WGBS) has resulted in a number of exciting discoveries about the role of DNA methylation leading to a plethora of novel testable hypotheses. Methods for constructing sodium bisulfite-converted and amplified libraries have recently advanced to the point that the bottleneck for experiments that use WGBS has shifted to data analysis and interpretation. Here we present empirical evidence for an over-representation of reads from methylated DNA in WGBS. This enrichment for methylated DNA is exacerbated by higher cycles of PCR and is influenced by the type of uracil-insensitive DNA polymerase used for amplifying the sequencing library. Future efforts to computationally correct for this enrichment bias will be essential to increasing the accuracy of determining methylation levels for individual cytosines. It is especially critical for studies that seek to accurately quantify DNA methylation levels in populations that may segregate for allelic DNA methylation states.
Project description:Methylation of cytosine in genomic DNA is a well-characterized epigenetic modification involved in many cellular processes and diseases. Whole-genome bisulfite sequencing (WGBS), such as MethylC-seq and post-bisulfite adaptor tagging sequencing (PBAT-seq), uses the power of high-throughput DNA sequencers and provides genome-wide DNA methylation profiles at single-base resolution. However, the accuracy and consistency of WGBS outputs in relation to the operating conditions of high-throughput sequencers have not been explored. We have used the Illumina HiSeq platform for our PBAT-based WGBS, and found that different versions of HiSeq Control Software (HCS) and Real-Time Analysis (RTA) installed on the system provided different global CpG methylation levels (approximately 5% overall difference) for the same libraries. This problem was reproduced multiple times with different WGBS libraries and was likely associated with the low sequence diversity of bisulfite-converted DNA. We found that HCS was the major determinant of the observed differences. To determine which version of HCS is most suitable for WGBS, we used substrates with predetermined CpG methylation levels, and found that HCS v2.0.5 is the best among the examined versions. HCS v2.0.12 showed the poorest performance and provided artificially lower CpG methylation levels when 5-methylcytosine is read as guanine (first read of PBAT-seq and second read of MethylC-seq). In addition, paired-end sequencing of low diversity libraries using HCS v2.2.38 or the latest HCS v2.2.58 was greatly affected by cluster densities. Software updates in the Illumina HiSeq platform can affect the outputs from low-diversity sequencing libraries such as WGBS libraries. More recent versions are not necessarily better, and HCS v2.0.5 is currently the best for WGBS among the examined HCS versions. Thus, together with other experimental conditions, special care has to be taken on this point when CpG methylation levels are to be compared between different samples by WGBS.
Project description:DNA methylation is a major epigenetic modification regulating several biological processes. A standard approach to measure DNA methylation is bisulfite sequencing (BS-Seq). BS-Seq couples bisulfite conversion of DNA with next-generation sequencing to profile genome-wide DNA methylation at single base resolution. The analysis of BS-Seq data involves the use of customized aligners for mapping bisulfite-converted reads and bioinformatic pipelines for downstream data analysis. Here we developed MethGo, a software tool designed for the analysis of data from whole-genome bisulfite sequencing (WGBS) and reduced representation bisulfite sequencing (RRBS). MethGo provides both genomic and epigenomic analyses including: 1) coverage distribution of each cytosine; 2) global cytosine methylation level; 3) cytosine methylation level distribution; 4) cytosine methylation level of genomic elements; 5) chromosome-wide cytosine methylation level distribution; 6) gene-centric cytosine methylation level; 7) cytosine methylation levels at transcription factor binding sites (TFBSs); 8) single nucleotide polymorphism (SNP) calling; and 9) copy number variation (CNV) calling. MethGo is a simple and effective tool for the analysis of BS-Seq data including both WGBS and RRBS. It contains 9 analyses in 5 major modules to profile the (epi)genome. It profiles genome-wide DNA methylation at global and gene-level scales. It can also analyze the methylation pattern around transcription factor binding sites, and assess genetic variations such as SNPs and CNVs. MethGo is coded in Python and is publicly available at http://paoyangchen-laboratory.github.io/methgo/.
Project description:In eukaryotic genomes, DNA methylation is an important type of epigenetic modification that plays crucial roles in many biological processes. To investigate the impact of a hypovirus infection on the methylome of Cryphonectria parasitica, the chestnut blight fungus, whole-genome bisulfite sequencing (WGBS) was employed to generate single-base resolution methylomes of the fungus with and without hypovirus infection. The results showed that hypovirus infection alters methylation in all three contexts (CG, CHG, and CHH), especially in gene promoters. A total of 600 differentially methylated regions (DMRs) were identified, of which 144 could be annotated to functional genes. RNA-seq analysis revealed that DNA methylation in promoters is negatively correlated with gene expression. Among the DMRs, four genes were shown to be involved in conidiation, orange pigment production, and virulence. Taken together, our DNA methylome analysis provides valuable insights into the relationship between DNA methylation and hypovirus infection, as well as phenotypic traits in C. parasitica.