Dataset Information

Accurate prediction of functional states of cis-regulatory modules reveals common epigenetic rules in humans and mice.


ABSTRACT:

Background

Predicting cis-regulatory modules (CRMs) in a genome and predicting their functional states in various cell/tissue types of the organism are two related, challenging computational tasks. Most current methods attempt to achieve both simultaneously using data for multiple epigenetic marks in a cell/tissue type. Though conceptually attractive, they suffer from high false discovery rates and limited applicability. To fill these gaps, we proposed a two-step strategy: first predict a map of CRMs in the genome, then predict the functional states of all the CRMs in various cell/tissue types of the organism. We recently developed an algorithm for the first step that predicts CRMs in a genome more accurately and completely than existing methods by integrating numerous transcription factor ChIP-seq datasets from the organism. Here, we present machine-learning methods for the second step.

Results

We show that the functional states of all the CRMs in the genome in a given cell/tissue type can be accurately predicted by a variety of machine-learning classifiers using data for only 1 to 4 epigenetic marks. Our predictions are substantially more accurate than the best achieved so far. Interestingly, a model trained on a human cell/tissue type can accurately predict functional states of CRMs in other human cell/tissue types as well as in mouse cell/tissue types, and vice versa. Therefore, the epigenetic code that defines functional states of CRMs in various cell/tissue types is universal, at least in humans and mice. Moreover, we found that tens to hundreds of thousands of CRMs were active in a given human or mouse cell/tissue type; up to 99.98% of them were reutilized in other cell/tissue types, while as few as 0.02% were unique to a single cell/tissue type and might define it.
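The idea of training a classifier on a few epigenetic mark signals in one cell/tissue type and applying it to another can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the four generic mark signals are unnamed, and a plain logistic regression stands in for the classifiers, which the abstract does not specify. It is not the authors' method or data.

```python
import math
import random

N_MARKS = 4  # assumption: four generic mark signals per CRM


def simulate_cell_type(n_crms, shift, rng):
    """Synthetic data: active CRMs get higher mark signal than inactive ones."""
    features, labels = [], []
    for _ in range(n_crms):
        active = 1 if rng.random() < 0.5 else 0
        mean = (2.0 if active else 0.0) + shift
        features.append([rng.gauss(mean, 1.0) for _ in range(N_MARKS)])
        labels.append(active)
    return features, labels


def column_means(features):
    n = len(features)
    return [sum(row[j] for row in features) / n for j in range(N_MARKS)]


def center(features, means):
    """Subtract per-mark means (computed on the training cell type)."""
    return [[x - m for x, m in zip(row, means)] for row in features]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(features, labels, lr=0.2, epochs=200):
    """Batch gradient descent on the logistic loss."""
    w = [0.0] * N_MARKS
    b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * N_MARKS
        grad_b = 0.0
        for row, y in zip(features, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - y
            for j in range(N_MARKS):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, features, labels):
    hits = 0
    for row, y in zip(features, labels):
        pred = 1 if sum(wj * xj for wj, xj in zip(w, row)) + b > 0 else 0
        hits += pred == y
    return hits / len(labels)


rng = random.Random(0)
# Hypothetical "training" cell type and a second, slightly shifted one.
train_x, train_y = simulate_cell_type(400, 0.0, rng)
other_x, other_y = simulate_cell_type(400, 0.1, rng)

means = column_means(train_x)
w, b = train_logistic(center(train_x, means), train_y)
acc_train = accuracy(w, b, center(train_x, means), train_y)
acc_other = accuracy(w, b, center(other_x, means), other_y)
print(f"training cell type accuracy: {acc_train:.3f}")
print(f"held-out cell type accuracy: {acc_other:.3f}")
```

The second synthetic cell type is centered with the training means, so the model never sees its labels or statistics. This mirrors, in a toy setting, the cross-cell-type and cross-species transfer described above.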

Conclusions

Our two-step approach can accurately predict the functional states of all the CRMs in the genome in any cell/tissue type using data for only 1 to 4 epigenetic marks. It is also more cost-effective than existing methods, which typically use data for more epigenetic marks. Our results suggest common epigenetic rules for defining functional states of CRMs in various cell/tissue types in humans and mice.

SUBMITTER: Ni P 

PROVIDER: S-EPMC9535988 | biostudies-literature | 2022 Oct

REPOSITORIES: biostudies-literature

Publications

Accurate prediction of functional states of cis-regulatory modules reveals common epigenetic rules in humans and mice.

Pengyu Ni, Joshua Moe, Zhengchang Su

BMC Biology, 2022 Oct 5, issue 1


Similar Datasets

| S-EPMC8210889 | biostudies-literature
| S-EPMC4290730 | biostudies-literature
| S-EPMC2230503 | biostudies-literature
| S-EPMC1796902 | biostudies-literature
| S-EPMC3424583 | biostudies-literature
| S-EPMC2394749 | biostudies-literature
| S-EPMC521067 | biostudies-literature
| S-EPMC2657044 | biostudies-literature
| S-EPMC6834773 | biostudies-literature
| S-EPMC3235160 | biostudies-literature