Genomics

Dataset Information


A Random-Forest Based Algorithm for Prediction of Enhancers From Histone Modifications


ABSTRACT: Transcriptional enhancers play critical roles in the regulation of gene expression, but their identification has remained a challenge. Recently, it was shown that enhancers in the mammalian genome are associated with characteristic histone modification patterns, which have been increasingly exploited for enhancer identification. However, only a limited number of histone modifications have previously been investigated for this purpose, leaving unanswered the question of whether there exists an optimal set of histone modifications that could improve enhancer prediction. Here, we address this issue by exploring a rich dataset produced by the human Epigenome Roadmap Project. Specifically, we examined genome-wide profiles of 24 histone modifications in human embryonic stem cells and fibroblasts, and developed a Random-Forest based algorithm to integrate histone modification profiles for identification of enhancers. As a training set, we used histone modification profiles at genome-wide binding sites of p300 in the two cell types, identified using ChIP-seq. We show that this algorithm not only leads to more accurate and precise prediction of enhancers than previous methods, but also helps identify an optimal set of three chromatin marks for enhancer prediction. ChIP-seq analysis of p300 in hESC H1 and IMR90 cells. Sequencing was done on the Illumina Genome Analyzer II platform for the H1 data and on the Illumina HiSeq for the IMR90 data. Data were mapped to hg18 using Bowtie.

ORGANISM(S): Homo sapiens

SUBMITTER: Wei Xie, Lee Edsall, Jason Ernst, Fulai Jin, John Stamatoyannopoulos, Manolis Kellis, Samantha Kuan, Bing Ren, Nisha Rajagopal, David Hawkins, Sarit Klugman, Yan Li, Audrey Kim

PROVIDER: E-GEOD-37858 | ArrayExpress | 2012-05-09

SECONDARY ACCESSION(S): GSE37858, SRP012970, PRJNA165163

REPOSITORIES: GEO, ArrayExpress, ENA

Dataset's files

E-GEOD-37858.README.txt Txt
E-GEOD-37858.idf.txt Idf
E-GEOD-37858.processed.1.zip Processed
E-GEOD-37858.processed.2.zip Processed
E-GEOD-37858.sdrf.txt Txt

Similar Datasets

| E-GEOD-37858 | BioStudies
| S-EPMC5035071 | BioStudies
| S-EPMC5144062 | BioStudies
| S-EPMC4977478 | BioStudies
| S-EPMC3597546 | BioStudies
2013-01-01 | S-EPMC3905895 | BioStudies
2009-10-13 | E-GEOD-17917 | ArrayExpress
2009-10-14 | E-GEOD-18292 | ArrayExpress
| S-EPMC3406964 | BioStudies
2013-10-20 | E-GEOD-43070 | ArrayExpress