Dataset Information

Benchmarking of modification-aware basecalling models

ABSTRACT: Benchmarking of modification-aware basecalling models

PROVIDER: PRJEB91026 | ENA

REPOSITORIES: ENA

Similar Datasets

Enhanced detection of RNA modifications with high-accuracy nanopore RNA basecalling models

Project description:Chemical RNA modifications, collectively referred to as the ‘epitranscriptome’, have been intensively studied in recent years, largely facilitated by the use of next-generation sequencing technologies. Recent efforts have turned towards the nanopore direct RNA sequencing (DRS) platform, as it allows simultaneous detection of diverse RNA modification types in full-length native RNA molecules. While RNA modifications can be identified in the form of systematic basecalling ‘errors’ in DRS datasets, m6A modifications produce very modest ‘errors’, limiting the applicability of this approach to sites that are modified at high stoichiometries. Here, we demonstrate that the use of alternative RNA basecalling models, trained with fully unmodified in vitro synthetic sequences, increases the ‘error’ signal of m6A modifications, leading to enhanced detection of RNA modifications even at lower stoichiometries. We then show that the use of these models enhances the detection of RNA modifications in previously published in vivo human samples, using third-party software for the detection of RNA modifications. Moreover, our work provides a novel RNA basecalling model that shows a median accuracy of 97%, compared to previously available RNA basecalling models that show 91% accuracy. Notably, this increase in accuracy leads not only to improved detection of RNA modifications, but also to enhanced mappability of RNA reads, which becomes more evident in the case of short RNA reads (50% increase). Altogether, our work stresses the importance of using fully unmodified RNA sequences for training RNA basecalling models, and shows how the use of different basecalling models can significantly affect the detection of RNA modifications and read mappability.

2024-08-26 | GSE246151 | GEO

Benchmarking AAV-Cre mouse models for RNA-seq

Project description:Benchmarking AAV-Cre mouse models for RNA-seq

| PRJNA695279 | ENA

Enhanced detection of RNA modifications with high-accuracy nanopore RNA basecalling models

Project description:Detection of RNA modifications from dRNA-seq data using enhanced basecalling models

| PRJEB67632 | ENA

Enhanced detection of RNA modifications with high-accuracy nanopore RNA basecalling models

Project description:Enhanced detection of RNA modifications with high-accuracy nanopore RNA basecalling models

| PRJNA1031667 | ENA

Improved Prediction of Smoking Status via Isoform-Aware RNA-seq Deep Learning Models

Project description:Improved Prediction of Smoking Status via Isoform-Aware RNA-seq Deep Learning Models

| PRJNA666220 | ENA

Adapting Nanopore Sequencing Basecalling Models for Modification Detection via Incremental Learning and Anomaly Detection.

Project description:We leverage machine learning approaches to adapt nanopore sequencing basecallers for nucleotide modification detection. We first apply the incremental learning technique to improve the basecalling of modification-rich sequences, which are usually of high biological interest. With sequence backbones resolved, we further run anomaly detection on individual nucleotides to determine their modification status. By this means, our pipeline promises the single-molecule, single-nucleotide and sequence context-free detection of modifications. We benchmark the pipeline using control oligos, and further apply it to the basecalling of densely modified yeast tRNAs and E. coli genomic DNAs, the cross-species detection of N6-methyladenosine (m6A) in mammalian mRNAs, and the simultaneous detection of N1-methyladenosine (m1A) and m6A in human mRNAs. Our IL-AD workflow is available at: https://github.com/wangziyuan66/IL-AD.

| S-EPMC10769248 | biostudies-literature

Adapting nanopore sequencing basecalling models for modification detection via incremental learning and anomaly detection.

Project description:We leverage machine learning approaches to adapt nanopore sequencing basecallers for nucleotide modification detection. We first apply the incremental learning (IL) technique to improve the basecalling of modification-rich sequences, which are usually of high biological interest. With sequence backbones resolved, we further run anomaly detection (AD) on individual nucleotides to determine their modification status. By this means, our pipeline promises the single-molecule, single-nucleotide, and sequence context-free detection of modifications. We benchmark the pipeline using control oligos, and further apply it to the basecalling of densely modified yeast tRNAs and E. coli genomic DNAs, the cross-species detection of N6-methyladenosine (m6A) in mammalian mRNAs, and the simultaneous detection of N1-methyladenosine (m1A) and m6A in human mRNAs. Our IL-AD workflow is available at: https://github.com/wangziyuan66/IL-AD.

| S-EPMC11339354 | biostudies-literature

ONT benchmarking paper

Project description:ONT benchmarking paper

| PRJEB93892 | ENA

iiCLIP protocol benchmarking with PTBP1

Project description:We describe an improved individual nucleotide resolution CLIP protocol (iiCLIP), which can be completed within 4 days from UV crosslinking to libraries for sequencing. For benchmarking, we directly compared PTBP1 iiCLIP libraries with the iCLIP2 protocol produced under standardised conditions with 1 million HEK293 cells, and with public eCLIP and iCLIP PTBP1 data. There are 3 PTBP1 iiCLIP libraries, 1 input iiCLIP library and 1 PTBP1 iCLIP2 library produced in this study.

2022-08-01 | E-MTAB-10881 | biostudies-arrayexpress

Benchmarking brain organoid recapitulation of fetal corticogenesis

Project description:Brain organoids (BO) have enabled the investigation of human corticogenesis in vitro, with an increasing range of protocols achieving a remarkable recapitulation of the process. However, we lack a resource gathering fetal cortex-specific gene co-expression patterns and their behavior in BO. We complement the current knowledge with a benchmarking of BO versus human corticogenesis, integrating: transcriptomes from in-house differentiated cortical BO (CBO), in-house processed human fetal brain samples, and analysis of transcriptomes from different BO systems and of prenatal cortical samples from the BrainSpan Atlas.

2022-12-31 | E-MTAB-11239 | biostudies-arrayexpress
