Dataset Information

ChIPulate: A comprehensive ChIP-seq simulation pipeline.


ABSTRACT: ChIP-seq (Chromatin Immunoprecipitation followed by sequencing) is a high-throughput technique to identify genomic regions that are bound in vivo by a particular protein, e.g., a transcription factor (TF). Biological factors, such as chromatin state, indirect and cooperative binding, as well as experimental factors, such as antibody quality, cross-linking, and PCR biases, are known to affect the outcome of ChIP-seq experiments. However, the relative impact of these factors on inferences made from ChIP-seq data is not entirely clear. Here, via a detailed ChIP-seq simulation pipeline, ChIPulate, we assess the impact of various biological and experimental sources of variation on several outcomes of a ChIP-seq experiment, viz., the recoverability of the TF binding motif, accuracy of TF-DNA binding detection, the sensitivity of inferred TF-DNA binding strength, and number of replicates needed to confidently infer binding strength. We find that the TF motif can be recovered despite poor and non-uniform extraction and PCR amplification efficiencies. The recovery of the motif is, however, affected to a larger extent by the fraction of sites that are either cooperatively or indirectly bound. Importantly, our simulations reveal that the number of ChIP-seq replicates needed to accurately measure in vivo occupancy at high-affinity sites is larger than the recommended community standards. Our results establish statistical limits on the accuracy of inferences of protein-DNA binding from ChIP-seq and suggest that increasing the mean extraction efficiency, rather than amplification efficiency, would better improve sensitivity. The source code and instructions for running ChIPulate can be found at https://github.com/vishakad/chipulate.
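To make the quantities named in the abstract concrete, the following is a minimal toy sketch of how occupancy, extraction efficiency, and PCR amplification efficiency could each act on the fragment count at a single site. It is not ChIPulate's model or API: the two-state occupancy formula, the function names, and all parameter values are assumptions chosen for illustration only. The repository linked above holds the actual pipeline.

```python
import math
import random


def occupancy(energy, chemical_potential):
    """Assumed two-state occupancy probability (energies in units of kT)."""
    return 1.0 / (1.0 + math.exp(energy - chemical_potential))


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_site(rng, energy, chemical_potential, n_cells,
                  extraction_eff, amplification_eff, pcr_cycles):
    """Toy single-site simulation (hypothetical, not ChIPulate's code).

    Returns (bound, extracted, amplified) fragment counts.
    """
    # Number of cells in which the site is bound by the TF.
    bound = binomial(rng, n_cells, occupancy(energy, chemical_potential))
    # Each bound fragment survives extraction with probability extraction_eff.
    extracted = binomial(rng, bound, extraction_eff)
    # In each PCR cycle, every fragment is duplicated with
    # probability amplification_eff.
    amplified = extracted
    for _ in range(pcr_cycles):
        amplified += binomial(rng, amplified, amplification_eff)
    return bound, extracted, amplified


if __name__ == "__main__":
    rng = random.Random(0)
    for eff in (0.1, 0.5, 0.9):
        print(eff, simulate_site(rng, 0.0, 0.0, 200, eff, 0.8, 5))
```

In this sketch, fragments lost at extraction cannot be recovered by later amplification, because PCR only multiplies what was extracted. That is one intuitive reading of the abstract's suggestion to prioritise extraction efficiency, though the paper's own argument rests on its full simulations.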

SUBMITTER: Datta V 

PROVIDER: S-EPMC6445533 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature

Publications

ChIPulate: A comprehensive ChIP-seq simulation pipeline.

Datta, Vishaka; Hannenhalli, Sridhar; Siddharthan, Rahul

PLoS Computational Biology, 2019 Mar 21, issue 3


Similar Datasets

S-EPMC8769893 | biostudies-literature
S-EPMC7525341 | biostudies-literature
S-EPMC5389943 | biostudies-literature
S-EPMC3031039 | biostudies-literature
S-EPMC3201895 | biostudies-other
S-EPMC4152589 | biostudies-literature
S-EPMC5792058 | biostudies-literature
S-EPMC5097360 | biostudies-literature
S-EPMC5142015 | biostudies-literature
S-EPMC3828144 | biostudies-literature