Project description: Transcription factors (TFs) recognizing DNA motifs within regulatory regions drive cell identity. Despite recent advances, their specificity remains incompletely understood. Here, we address this by contrasting two TFs, Neurogenin-2 (NGN2) and MyoD1, which recognize ubiquitous E-box motifs yet drive distinct cell fates toward neurons and muscles, respectively. Upon induction in mouse embryonic stem cells, we monitor binding across differentiation, employing an interpretable machine learning approach that integrates preexisting DNA accessibility. This reveals a chromatin-dependent motif syntax, delineating both common and factor-specific binding, validated by cellular and in vitro assays. Shared binding sites reside in open chromatin, locally influenced by nucleosomes. In contrast, factor-specific binding in closed chromatin involves NGN2 and MyoD1 acting as pioneer factors, influenced by motif variant frequencies, motif spacing, and interaction partners, which together account for subsequent lineage divergence. Transferring our methodology to other models demonstrates how a combination of opportunistic binding and context-specific chromatin opening underpins TF specificity, driving differentiation trajectories.
Project description: High-throughput screening and gene signature analyses frequently identify lead therapeutic compounds with unknown modes of action (MoAs), and the resulting uncertainties can lead to the failure of clinical trials. We developed a multi-omics approach for uncovering MoAs through an interpretable machine learning model of the effects of compounds on transcriptomic, epigenomic, metabolomic, and proteomic data. We applied this approach to examine compounds with beneficial effects in models of Huntington’s disease, finding common MoAs for previously unrelated compounds that were not predicted based on similarities in the compounds’ structures, connectivity scores, or binding targets. We experimentally validated two such disease-relevant MoAs, autophagy activation and bioenergetics manipulation. This interpretable machine learning approach can be used to find and evaluate MoAs in future drug development efforts.
Project description: We hypothesized that transcription factor activities found to be predictive of overall Mycobacterium tuberculosis (Mtb) fitness in a pooled culture of transcription factor induction strains could also be used to predict growth phenotypes under other growth conditions using interpretable machine learning models. To test this, we measured transcriptional responses of Mtb grown under hypoxic and reaeration stress, without recombinant transcription factor perturbation, as input data for our modeling.
Project description: Understanding the constraints that shape antibiotic resistance is key to predicting and controlling drug resistance. Here, we performed high-throughput laboratory evolution of Escherichia coli. The transcriptome, resistance, and genomic profiles of the evolved strains in 48 environments were quantitatively analyzed. Analyzing these quantitative datasets with interpretable machine learning techniques revealed the emergence of low-dimensional phenotypic states among the 192 strains. Further analysis revealed the underlying biological processes responsible for the distinct states. We also report a novel constraint that leads to decelerated evolution. These findings bridge the genotypic, gene expression, and drug resistance spaces, and lead to a comprehensive understanding of the constraints on antibiotic resistance.