Project description:To generate data suitable for deciphering cis-regulatory logic, we generated ~100 million synthetic promoters (in yeast) composed of random DNA and measured their expression by FACS (sorting into 18 bins).
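The description does not say how an expression value is derived from the 18 sorting bins. A common summary in sort-and-sequence experiments is the read-weighted mean bin for each promoter; the minimal sketch below assumes that summary. The function name `mean_bin` and the example read counts are hypothetical.

```python
N_BINS = 18  # number of FACS sorting bins stated in the description


def mean_bin(counts):
    """Read-weighted mean bin index (0-based) for one promoter.

    `counts[i]` is the number of sequencing reads for this promoter
    recovered from bin i. Returns None if the promoter has no reads.
    This summary statistic is an assumption, not the authors' stated method.
    """
    if len(counts) != N_BINS:
        raise ValueError(f"expected {N_BINS} bins, got {len(counts)}")
    total = sum(counts)
    if total == 0:
        return None
    return sum(i * c for i, c in enumerate(counts)) / total


# Hypothetical read counts for one promoter, concentrated around bin 5.
example = [0] * N_BINS
example[4], example[5], example[6] = 2, 6, 2

print(mean_bin(example))
```

With ~100 million promoters, most sequences would be seen in only a few reads, so any per-promoter estimate of this kind is coarse; the sketch only illustrates the arithmetic.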
Project description:Machine learning methods, particularly neural networks trained on large datasets, are transforming how scientists approach scientific discovery and experimental design. However, current state-of-the-art neural networks are limited by their uninterpretability: despite providing accurate predictions, they cannot describe how they arrived at their predictions. Here, using an "interpretable-by-design" approach, we present a neural network model that provides insights into RNA splicing, a fundamental process in the transfer of genomic information into functional biochemical products. Although we designed our model to emphasize interpretability, its predictive accuracy is on par with state-of-the-art models. To demonstrate the model's interpretability, we introduce a visualization that, for any given exon, allows us to trace and quantify the entire decision process from input sequence to output splicing prediction. Importantly, the model revealed novel components of the splicing logic, which we experimentally validated. This study highlights how interpretable machine learning can advance scientific discovery.
Project description:Pulsed SILAC approaches allow measurement of protein dynamics, including protein translation and degradation. However, their use in quantifying acute changes has been limited due to the low stoichiometry of labeled peptides. Here, we describe the use of instrument logic to select peaks of interest via targeted mass differences (TMD) to overcome this limitation. Comparing peptides artificially mixed at low heavy-to-light stoichiometry and measured using standard data-dependent acquisition with or without TMD revealed 2- to 3-fold increases in identification without significant loss in quantification precision for both MS2 and MS3 methods. Our benchmarked approach increases throughput by reducing the necessary machine time. We anticipate that all pulsed SILAC measurements, whether or not combined with TMT, would greatly benefit from instrument-logic-based approaches.
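The idea behind selecting peaks by a targeted mass difference can be illustrated offline: a light/heavy peptide pair at charge z is separated in m/z by the label's mass shift divided by z. The sketch below assumes the common heavy labels 13C6 15N2 lysine (+8.014199 Da) and 13C6 15N4 arginine (+10.008269 Da), which the description does not name; the function `find_pairs`, the tolerance, and the peak list are hypothetical. The actual method runs as instrument logic during acquisition, which this example does not reproduce.

```python
# Assumed heavy-label mass shifts in Da (labels are not named in the description).
HEAVY_SHIFTS_DA = {"Lys8": 8.014199, "Arg10": 10.008269}


def find_pairs(mz_values, charges=(2, 3), tol_ppm=10.0):
    """Find light/heavy peak pairs whose m/z spacing matches a label shift.

    For each candidate light peak, the expected heavy partner sits at
    light_mz + shift / z. Returns (light_mz, heavy_mz, label, charge) tuples.
    """
    peaks = sorted(mz_values)
    pairs = []
    for light in peaks:
        for label, shift in HEAVY_SHIFTS_DA.items():
            for z in charges:
                target = light + shift / z
                tol = target * tol_ppm * 1e-6  # ppm tolerance converted to m/z
                for heavy in peaks:
                    if abs(heavy - target) <= tol:
                        pairs.append((light, heavy, label, z))
    return pairs


# Hypothetical MS1 peak list: one doubly charged Lys8 pair plus an unrelated peak.
peaks = [500.2500, 504.2571, 650.1000]
pairs = find_pairs(peaks)
print(pairs)
```

Here 500.2500 and 504.2571 differ by about 4.0071 m/z, which matches 8.014199 / 2, so the pair is reported as a doubly charged Lys8 pair; the peak at 650.1000 has no partner and is ignored.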
Project description:Core regulatory transcription factors (CR TFs) define cell identity and lineage through an exquisitely precise and logical order during embryogenesis and development. These CR TFs regulate one another in three-dimensional space via distal enhancers that serve as logic gates embedded in their TF recognition sequences. Aberrant chromatin organization resulting in miswired circuitry of enhancer logic is a newly recognized feature in many cancers. Here, we report that PAX3-FOXO1 expression is driven by a translocated FOXO1 distal super enhancer (SE). ChIP-seq in tumors bearing rare PAX translocations indicates that enhancer miswiring is a pervasive feature across all FP-RMS tumors. Therefore, our data reveal a mechanism in which a translocated, hijacked enhancer disrupts the normal CR TF logic during skeletal muscle development (PAX3 to MYOD to MYOG), replacing it with an infinite-loop logic that makes rhabdomyosarcoma cells unable to exit the undifferentiated proliferating stage.