Project description: Transcription factors (TFs) recognizing DNA motifs within regulatory regions drive cell identity. Despite recent advances, their specificity remains incompletely understood. Here, we address this by contrasting two TFs, Neurogenin-2 (NGN2) and MyoD1, which recognize ubiquitous E-box motifs yet drive distinct cell fates toward neurons and muscles, respectively. Upon induction in mouse embryonic stem cells, we monitor binding across differentiation, employing an interpretable machine learning approach that integrates preexisting DNA accessibility. This reveals a chromatin-dependent motif syntax, delineating both common and factor-specific binding, validated by cellular and in vitro assays. Shared binding sites reside in open chromatin, locally influenced by nucleosomes. In contrast, factor-specific binding in closed chromatin involves NGN2 and MyoD1 acting as pioneer factors, influenced by motif variant frequencies, motif spacing, and interaction partners, which together account for subsequent lineage divergence. Transferring our methodology to other models demonstrates how a combination of opportunistic binding and context-specific chromatin opening underpins TF specificity, driving differentiation trajectories.
Project description: The information about when and where each gene is to be expressed is mainly encoded in the DNA sequence of enhancers, sequence elements that comprise binding sites (motifs) for different transcription factors (TFs). Most of the research on enhancer sequences has focused on TF motif presence, while the enhancer syntax, i.e. the flexibility of important motif positions and how the sequence context modulates the activity of TF motifs, remains poorly understood. Here, we explore the rules of enhancer syntax by a two-pronged approach in Drosophila melanogaster S2 cells: we (1) replace important motifs with an exhaustive set of all 65,536 possible eight-nucleotide-long random sequences and (2) paste eight important TF motif types into 763 motif positions within 496 enhancers. These complementary strategies reveal that enhancers display constrained sequence flexibility and context-specific modulation of motif function. Important motifs can be functionally replaced by hundreds of sequences constituting several distinct motif types, but only a fraction of all possible sequences and motif types restore enhancer activity. Moreover, TF motifs contribute with different intrinsic strengths that are strongly modulated by the enhancer sequence context (the flanking sequence, the presence and diversity of other motif types, and the distance between motifs), such that not all motif types can work in all positions. Constrained sequence flexibility and context-specific modulation of motif function are also hallmarks of human enhancers and TF motifs, as we demonstrate experimentally. Overall, these two general principles of enhancer sequences are important for understanding and predicting enhancer function during development, evolution and disease.
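The figure of 65,536 replacement sequences follows from the four-letter DNA alphabet: an eight-nucleotide window has 4^8 = 65,536 possible sequences. The sketch below enumerates that set and substitutes one variant into a motif position. It is only an illustration of the arithmetic, not the authors' library design; the function names, the toy enhancer string and the motif offset are all hypothetical.

```python
from itertools import product

BASES = "ACGT"  # the four DNA nucleotides
K = 8           # window length stated in the description


def all_kmers(k=K, alphabet=BASES):
    """Yield every length-k sequence over the alphabet, in lexicographic order."""
    for letters in product(alphabet, repeat=k):
        yield "".join(letters)


def substitute(enhancer, start, replacement):
    """Return the enhancer with the window beginning at `start` overwritten."""
    end = start + len(replacement)
    if start < 0 or end > len(enhancer):
        raise ValueError("replacement window falls outside the enhancer")
    return enhancer[:start] + replacement + enhancer[end:]


kmers = list(all_kmers())
print(len(kmers))  # 4 ** 8 = 65536

# Hypothetical toy enhancer and motif offset, for illustration only.
toy_enhancer = "ACGT" * 6
variant = substitute(toy_enhancer, 8, kmers[0])
print(variant)
```

Each variant has the same length as the original sequence, so the flanking context on both sides of the replaced window is left unchanged.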