Dataset Information

Domain-adaptive neural networks improve cross-species prediction of transcription factor binding.


ABSTRACT: The intrinsic DNA sequence preferences and cell type-specific cooperative partners of transcription factors (TFs) are typically highly conserved. Hence, despite the rapid evolutionary turnover of individual TF binding sites, predictive sequence models of cell type-specific genomic occupancy of a TF in one species should generalize to closely matched cell types in a related species. To assess the viability of cross-species TF binding prediction, we train neural networks to discriminate ChIP-seq peak locations from genomic background and evaluate their performance within and across species. Cross-species predictive performance is consistently worse than within-species performance, which we show is caused in part by species-specific repeats. To account for this domain shift, we use an augmented network architecture to automatically discourage learning of training species-specific sequence features. This domain adaptation approach corrects for prediction errors on species-specific repeats and improves overall cross-species model performance. Our results show that cross-species TF binding prediction is feasible when models account for domain shifts driven by species-specific repeats.

SUBMITTER: Cochran K 

PROVIDER: S-EPMC8896468 | biostudies-literature | 2022 Mar

REPOSITORIES: biostudies-literature

Publications

Domain-adaptive neural networks improve cross-species prediction of transcription factor binding.

Kelly Cochran, Divyanshi Srivastava, Avanti Shrikumar, Akshay Balsubramani, Ross C. Hardison, Anshul Kundaje, Shaun Mahony

Genome Research, 2022-01-18, issue 3

Similar Datasets

| S-EPMC10002701 | biostudies-literature
| S-EPMC10655966 | biostudies-literature
| S-EPMC8448872 | biostudies-literature
| S-EPMC11423150 | biostudies-literature
| S-EPMC9917285 | biostudies-literature
| S-EPMC10348819 | biostudies-literature
| S-EPMC7852092 | biostudies-literature
| S-EPMC10841018 | biostudies-literature
2022-02-21 | GSE197009 | GEO
| S-EPMC8609763 | biostudies-literature