
Dataset Information


ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data.


ABSTRACT: Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discovery and quantification of transcript isoforms from error-prone long reads. ESPRESSO jointly considers alignments of all long reads aligned to a gene and uses error profiles of individual reads to improve the identification of splice junctions and the discovery of their corresponding transcript isoforms. On both a synthetic spike-in RNA sample and human RNA samples, ESPRESSO outperforms multiple contemporary tools in not only transcript isoform discovery but also transcript isoform quantification. In total, we generated and analyzed ~1.1 billion nanopore RNA-seq reads covering 30 human tissue samples and three human cell lines. ESPRESSO and its companion dataset provide a useful resource for studying the RNA repertoire of eukaryotic transcriptomes.

SUBMITTER: Gao Y 

PROVIDER: S-EPMC9858503 | biostudies-literature | 2023 Jan

REPOSITORIES: biostudies-literature


Publications

ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data.

Yuan Gao, Feng Wang, Robert Wang, Eric Kutschera, Yang Xu, Stephan Xie, Yuanyuan Wang, Kathryn E. Kadash-Edmondson, Lan Lin, Yi Xing

Science Advances, 2023 Jan 20, issue 3



Similar Datasets

2022-10-14 | GSE192955 | GEO
| PRJNA793881 | ENA
| S-EPMC10448944 | biostudies-literature
| S-EPMC10402094 | biostudies-literature
| S-EPMC11543605 | biostudies-literature
| S-EPMC7355257 | biostudies-literature
2023-11-27 | GSE248115 | GEO
2023-11-27 | GSE248114 | GEO
2023-11-27 | GSE248118 | GEO
| S-EPMC7688782 | biostudies-literature