Dataset Information

Machine learning for discovery: deciphering RNA splicing logic

ABSTRACT: Machine learning methods, particularly neural networks trained on large datasets, are transforming how scientists approach scientific discovery and experimental design. However, current state-of-the-art neural networks are limited by their uninterpretability: despite providing accurate predictions, they cannot describe how they arrived at their predictions. Here, using an "interpretable-by-design" approach, we present a neural network model that provides insights into RNA splicing, a fundamental process in the transfer of genomic information into functional biochemical products. Although we designed our model to emphasize interpretability, its predictive accuracy is on par with state-of-the-art models. To demonstrate the model's interpretability, we introduce a visualization that, for any given exon, allows us to trace and quantify the entire decision process from input sequence to output splicing prediction. Importantly, the model revealed novel components of the splicing logic, which we experimentally validated. This study highlights how interpretable machine learning can advance scientific discovery.

ORGANISM(S): Homo sapiens

PROVIDER: GSE200096 | GEO | 2022/10/01

REPOSITORIES: GEO

Similar Datasets

Interpreting biologically informed neural networks for enhanced proteomic biomarker discovery and pathway analysis

Project description: The incorporation of machine learning methods into proteomics workflows improves the identification of disease-relevant biomarkers and biological pathways. However, machine learning models, such as deep neural networks, typically suffer from a lack of interpretability. Here, we present a deep learning approach that combines biological pathway analysis and biomarker identification to increase the interpretability of proteomics experiments. Our approach integrates a priori knowledge of the relationships between proteins, biological pathways, and biological processes into sparse neural networks to create biologically informed neural networks. We employ these networks to differentiate between clinical subphenotypes of septic acute kidney injury and COVID-19, as well as acute respiratory distress syndrome of different aetiologies. To gain biological insight into these complex syndromes, we use feature-attribution methods to introspect the networks and identify the proteins and pathways important for distinguishing between subtypes. The algorithms are implemented in a freely available, open-source Python package (https://github.com/InfectionMedicineProteomics/BINN).

2023-09-05 | PXD044264 | Pride

A scalable derivative-free optimizer trained through learning to optimize approach to interpret drug response mechanism network

Project description: With the emergence of various drug-based treatment strategies, many drug response prediction models have been developed to understand their effects. However, to provide a comprehensive understanding of a drug response, a prediction model should reflect the underlying biological mechanisms, and current models suffer from interpretability and scalability problems. Machine learning-based prediction models base their predictions on inferred features, which are usually not well correlated with biological mechanisms, making their response predictions hard to interpret. Boolean modeling schemes may allow interpretation of the mechanisms that contribute to a particular response, but optimizing Boolean models is difficult because of their high-dimensional search space and discontinuous loss function. Here, we developed a scalable derivative-free optimizer for weighted-sum Boolean networks through meta-reinforcement learning. By using a graph network and a coordinate-wise policy, our learned optimizer can optimize high-dimensional Boolean networks of arbitrary structure containing over 100 parameters, showing higher sample efficiency than other meta-heuristic algorithms. The optimized Boolean networks successfully predict drug responses congruent with public databases and in-house experimental data. Moreover, mechanistic analysis of the optimized networks shows that the predictions are reliably interpretable, yielding meaningful suggestions of known basket trial drug response prediction markers.

2024-04-27 | GSE184731 | GEO

Project description: Machine Learning based antimicrobial peptide design

| PRJNA1243451 | ENA

A Multi-Omics Interpretable Machine Learning Model Reveals Modes of Action of Small Molecules (RNA-Seq)

Project description: High-throughput screening and gene signature analyses frequently identify lead therapeutic compounds with unknown modes of action (MoAs), and the resulting uncertainties can lead to the failure of clinical trials. We developed a multi-omics approach for uncovering MoAs through an interpretable machine learning model of the effects of compounds on transcriptomic, epigenomic, metabolomic, and proteomic data. We applied this approach to examine compounds with beneficial effects in models of Huntington’s disease, finding common MoAs for previously unrelated compounds that were not predicted based on similarities in the compounds’ structures, connectivity scores, or binding targets. We experimentally validated two such disease-relevant MoAs, autophagy activation and bioenergetics manipulation. This interpretable machine learning approach can be used to find and evaluate MoAs in future drug development efforts.

2020-01-22 | GSE129143 | GEO

A Multi-Omics Interpretable Machine Learning Model Reveals Modes of Action of Small Molecules (ChIP-Seq)

2020-01-22 | GSE129141 | GEO

Revealing the Grammar of Small RNA Secretion Using Interpretable Machine Learning

Project description: Revealing the Grammar of Small RNA Secretion Using Interpretable Machine Learning

| PRJNA956958 | ENA

In silico nano-dissection: defining cell type specificity at transcriptional level in human disease (tubulointerstitium)

Project description: To identify genes with cell-lineage-specific expression not accessible by experimental micro-dissection, we developed a genome-scale iterative method, in silico nano-dissection, which leverages high-throughput functional-genomics data from tissue homogenates using a machine-learning framework. This study applied nano-dissection to chronic kidney disease (CKD) and identified transcripts specific to podocytes, key cells in the glomerular filter responsible for hereditary proteinuric syndromes and acquired CKD. In silico prediction accuracy exceeded that of predictions derived from fluorescence-tagged murine podocytes; the method identified genes recently implicated in hereditary glomerular disease and predicted genes significantly correlated with kidney function. The nano-dissection method is broadly applicable to defining lineage specificity in many functional and disease contexts. We applied a machine-learning framework to high-throughput gene expression data from human kidney biopsy tissue homogenates and predicted novel podocyte-specific genes. The predictions were validated at the protein level using the Human Protein Atlas. Prediction accuracy was compared with predictions derived from an experimental approach using fluorescence-tagged murine podocytes.

2013-08-06 | E-GEOD-47184 | biostudies-arrayexpress

In silico nano-dissection: defining cell type specificity at transcriptional level in human disease (glomeruli)

2013-08-06 | E-GEOD-47183 | biostudies-arrayexpress

Revealing the Grammar of Small RNA Secretion Using Interpretable Machine Learning I

Project description: Revealing the Grammar of Small RNA Secretion Using Interpretable Machine Learning I

| PRJNA956962 | ENA

Revealing the Grammar of Small RNA Secretion Using Interpretable Machine Learning III

Project description: Revealing the Grammar of Small RNA Secretion Using Interpretable Machine Learning III

| PRJNA956965 | ENA