Genomics

Dataset Information


Accurate inference of transcription factor binding from DNA sequence and chromatin accessibility data


ABSTRACT: Accurate functional annotation of regulatory elements is essential for understanding global gene regulation. Here, we report a genome-wide map of 827,000 transcription factor binding sites in human lymphoblastoid cell lines, which is comprised of sites corresponding to 239 position weight matrices of known transcription factor binding motifs, and 49 novel sequence motifs. To generate this map, we developed a probabilistic framework that integrates cell- or tissue-specific experimental data such as histone modifications and DNaseI cleavage patterns with genomic information such as gene annotation and evolutionary conservation. Comparison to empirical ChIP-seq data suggests that our method is highly accurate yet has the advantage of targeting many factors in a single assay. We anticipate that this approach will be a valuable tool for genome-wide studies of gene regulation in a wide variety of cell types or tissues under diverse conditions.

DNaseI-seq on two YRI HapMap cell lines. Each individual was sequenced on 8 lanes of the Illumina Genome Analyzer II.

ORGANISM(S): Homo sapiens  

SUBMITTER: Yoav Gilad, Athma A Pai, Jonathan K Pritchard, Daniel Gaffney, Roger Pique-Regi, Jacob F Degner

PROVIDER: E-GEOD-25341 | ArrayExpress | 2010-11-15

SECONDARY ACCESSION(S): GSE25341, SRP004446, PRJNA134875

REPOSITORIES: GEO, ArrayExpress, ENA


Publications

Accurate inference of transcription factor binding from DNA sequence and chromatin accessibility data.

Pique-Regi R, Degner JF, Pai AA, Gaffney DJ, Gilad Y, Pritchard JK

Genome Research, 2010-11-24, issue 3



Similar Datasets

2010-11-15 | GSE25341 | GEO
2014-09-24 | E-GEOD-51726 | ArrayExpress
2015-07-03 | E-GEOD-64864 | ArrayExpress
2015-07-03 | E-GEOD-64862 | ArrayExpress
2010-08-17 | E-GEOD-19063 | ArrayExpress
2015-11-25 | E-GEOD-61844 | ArrayExpress
2012-12-17 | E-GEOD-42398 | ArrayExpress
2015-05-18 | E-GEOD-55046 | ArrayExpress
2014-08-01 | E-GEOD-53348 | ArrayExpress
2014-01-15 | E-GEOD-51022 | ArrayExpress