Dataset Information


A highly efficient and effective motif discovery method for ChIP-seq/ChIP-chip data using positional information.

ABSTRACT: Identification of DNA motifs from ChIP-seq/ChIP-chip [chromatin immunoprecipitation (ChIP)] data is a powerful method for understanding transcriptional regulatory networks. However, most established methods were designed for small sample sizes and are inefficient on ChIP data. Here we propose a new k-mer occurrence model that reflects the fact that functional DNA k-mers often cluster around ChIP peak summits. Based on this model, we introduce a new measure for discovering functional k-mers. Using simulation, we demonstrate that our method is more robust against noise in ChIP data than available methods. A novel word clustering method is also implemented to group similar k-mers into position weight matrices (PWMs). We applied our method to a diverse set of ChIP experiments to demonstrate its high sensitivity and specificity. Importantly, our method is much faster than several other methods for large sample sizes. Thus, we have developed an efficient and effective motif discovery method for ChIP experiments.
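
The positional idea in the abstract (functional k-mers tend to cluster around peak summits) can be illustrated with a minimal sketch. This is not the authors' occurrence model or their scoring measure, neither of which is specified in this record. It assumes each input sequence is centered on its peak summit, and the function name `kmer_center_fraction` and the parameters `k` and `window` are hypothetical, chosen only for illustration. For each k-mer it reports the fraction of occurrences whose midpoint falls within `window` bases of the assumed summit.

```python
from collections import defaultdict


def kmer_center_fraction(sequences, k=6, window=10):
    """Return {kmer: fraction of occurrences near the sequence center}.

    Illustrative assumption: each sequence is centered on its ChIP peak
    summit, so the summit is taken to be at index len(seq) // 2.
    """
    near = defaultdict(int)   # occurrences within `window` of the summit
    total = defaultdict(int)  # all occurrences
    for seq in sequences:
        seq = seq.upper()
        summit = len(seq) // 2
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) - set("ACGT"):
                continue
            total[kmer] += 1
            midpoint = i + k // 2
            if abs(midpoint - summit) <= window:
                near[kmer] += 1
    return {kmer: near[kmer] / total[kmer] for kmer in total}


# Toy input: one motif-like k-mer placed at the center of a T-rich sequence.
toy = ["T" * 10 + "GATAAG" + "T" * 10]
fractions = kmer_center_fraction(toy, k=6, window=2)
```

In this toy input, the centrally placed k-mer has all of its occurrences near the assumed summit, while the flanking poly-T k-mer has none. A real analysis would additionally need a background model to judge whether such clustering is statistically significant.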


PROVIDER: S-EPMC3326300 | BioStudies | 2012-01-01

REPOSITORIES: biostudies

Similar Datasets

2020-01-01 | S-EPMC6964213 | BioStudies
1000-01-01 | S-EPMC6199061 | BioStudies
2006-01-01 | S-EPMC2268903 | BioStudies
2018-01-01 | S-EPMC5991515 | BioStudies
2019-01-01 | S-EPMC6754165 | BioStudies
2007-01-01 | S-EPMC2268896 | BioStudies
2012-01-01 | S-EPMC3465233 | BioStudies
2019-01-01 | S-EPMC6608620 | BioStudies
2014-01-01 | S-EPMC3946423 | BioStudies
2009-01-01 | S-EPMC2688469 | BioStudies