Dataset Information

Estimation of CpG coverage in whole methylome next-generation sequencing studies.


ABSTRACT: BACKGROUND: Methylation studies are a promising complement to genetic studies of DNA sequence. However, detailed prior biological knowledge is typically lacking, so methylome-wide association studies (MWAS) will be critical to detect disease-relevant sites. A cost-effective approach involves the next-generation sequencing (NGS) of single-end libraries created from samples that are enriched for methylated DNA fragments. A limitation of single-end libraries is that the fragment size distribution is not observed. This hampers several aspects of the data analysis, such as the calculation of enrichment measures that are based on the number of fragments covering the CpGs. RESULTS: We developed a non-parametric method that uses isolated CpGs to estimate sample-specific fragment size distributions from the empirical sequencing data. Through simulations we show that our method is highly accurate. While the traditional (extended) read count methods resulted in severely biased coverage estimates and introduced artificial inter-individual differences, through the use of the estimated fragment size distributions we could remove these biases almost entirely. Furthermore, we found correlations of 0.999 between coverage estimates obtained using fragment size distributions that were estimated with our method and those that were "observed" in paired-end sequencing data. CONCLUSIONS: We propose a non-parametric method for estimating fragment size distributions that is highly precise and can improve the analysis of cost-effective MWAS that sequence single-end libraries created from samples that are enriched for methylated DNA fragments.
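
To illustrate why the fragment size distribution matters for coverage, the following is a minimal Python sketch, not the authors' implementation. It assumes a fragment size distribution is already available (the paper's estimation step from isolated CpGs is not reproduced here) and considers forward-strand reads only. The function names, the toy distribution, and the read positions are all hypothetical. The sketch contrasts an expected coverage, in which each read is weighted by the probability that its fragment is long enough to reach the CpG, with a simple extended read count that treats every fragment as having one fixed length.

```python
def survival(size_probs):
    """Return surv where surv[d] = P(fragment length > d) for d = 0..max length.

    size_probs maps fragment length (in bp) to probability (hypothetical input).
    """
    max_len = max(size_probs)
    surv = [0.0] * (max_len + 1)
    tail = 0.0
    # Accumulate the upper tail from the longest fragment length downwards.
    for d in range(max_len - 1, -1, -1):
        tail += size_probs.get(d + 1, 0.0)
        surv[d] = tail
    return surv


def expected_coverage(cpg_pos, read_starts, surv):
    """Expected number of fragments covering cpg_pos (forward strand only).

    A fragment starting at s covers the CpG when its length exceeds cpg_pos - s.
    """
    total = 0.0
    for s in read_starts:
        d = cpg_pos - s
        if 0 <= d < len(surv):
            total += surv[d]
    return total


def extended_read_count(cpg_pos, read_starts, extend_to):
    """Count reads that reach cpg_pos when every read is extended to extend_to bp."""
    return sum(1 for s in read_starts if 0 <= cpg_pos - s < extend_to)


# Toy example (assumed values): half the fragments are 100 bp, half are 200 bp.
toy_sizes = {100: 0.5, 200: 0.5}
toy_surv = survival(toy_sizes)
toy_reads = [1000, 950, 850, 700]  # forward-strand read start positions

print(expected_coverage(1000, toy_reads, toy_surv))
print(extended_read_count(1000, toy_reads, extend_to=150))
```

In this toy setting the read starting 150 bp upstream contributes a fractional weight to the expected coverage but is dropped entirely by the fixed 150 bp extension, which is the kind of discrepancy the abstract attributes to extended read count methods.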

SUBMITTER: van den Oord EJ 

PROVIDER: S-EPMC3599116 | biostudies-literature | 2013 Feb

REPOSITORIES: biostudies-literature

Publications

Estimation of CpG coverage in whole methylome next-generation sequencing studies.

Edwin J C G van den Oord, Jozsef Bukszar, Gábor Rudolf, Srilaxmi Nerella, Joseph L McClay, Lin Y Xie, Karolina A Aberg

BMC Bioinformatics, 2013-02-12


Similar Datasets

2017-04-03 | PXD003804 | Pride
| S-EPMC8224930 | biostudies-literature
| S-EPMC7175505 | biostudies-literature
| S-EPMC4400156 | biostudies-literature
| S-EPMC3212839 | biostudies-other
| S-EPMC3148210 | biostudies-literature
| S-EPMC3370281 | biostudies-literature
| S-EPMC4682413 | biostudies-literature