Dataset Information

NETWORK MODELLING OF TOPOLOGICAL DOMAINS USING HI-C DATA.


ABSTRACT: Chromosome conformation capture experiments such as Hi-C are used to map the three-dimensional spatial organization of genomes. One specific feature of the 3D organization is known as topologically associating domains (TADs), which are densely interacting, contiguous chromatin regions playing important roles in regulating gene expression. A few algorithms have been proposed to detect TADs. In particular, the structure of Hi-C data naturally inspires application of community detection methods. However, one of the drawbacks of community detection is that most methods take exchangeability of the nodes in the network for granted, whereas the nodes in this case, that is, the positions on the chromosomes, are not exchangeable. We propose a network model for detecting TADs using Hi-C data that takes into account this nonexchangeability. In addition, our model explicitly makes use of cell-type specific CTCF binding sites as biological covariates and can be used to identify conserved TADs across multiple cell types. The model leads to a likelihood objective that can be efficiently optimized via relaxation. We also prove that when suitably initialized, this model finds the underlying TAD structure with high probability. Using simulated data, we show the advantages of our method and the caveats of popular community detection methods, such as spectral clustering, in this application. Applying our method to real Hi-C data, we demonstrate that the domains identified have desirable epigenetic features and compare them across different cell types.

SUBMITTER: Wang YXR 

PROVIDER: S-EPMC7508461 | biostudies-literature | 2019 Sep

REPOSITORIES: biostudies-literature

Publications

NETWORK MODELLING OF TOPOLOGICAL DOMAINS USING HI-C DATA.

Wang YXR, Sarkar P, Ursu O, Kundaje A, Bickel PJ

The Annals of Applied Statistics, 2019-09-01, issue 3



Similar Datasets

2021-11-16 | GSE188753 | GEO
2018-05-30 | GSE115062 | GEO
| S-EPMC9270730 | biostudies-literature
| S-EPMC5821732 | biostudies-literature
| S-EPMC7055922 | biostudies-literature
2015-10-06 | GSE55743 | GEO
2020-01-01 | GSE125640 | GEO