Transcriptomics

Dataset Information

Merging short and stranded long reads improves transcript assembly


ABSTRACT: New tools are required for improved long-read transcript assembly and for merging it with its short-read counterpart. Using our short- and long-read measurements from different cell lines with spiked-in standards, we systematically compared key parameters and biases in the read alignment and assembly of transcripts. We report a cDNA synthesis artifact in long-read datasets that impacts the identity and quantitation of assembled transcripts. We developed a computational pipeline to strand long-read cDNA libraries that markedly improves assembly of transcripts from long reads. Incorporating stranded long reads in a new hybrid assembly approach, we demonstrate its efficacy for improved characterization of challenging lncRNA transcripts. Our workflow can be applied to a wide range of transcriptomics datasets for superior demarcation of transcript ends and refined isoform structure, which can enable better differential gene expression analyses and molecular manipulations of transcripts.

ORGANISM(S): Mus musculus

PROVIDER: GSE215357 | GEO | 2023/10/14

REPOSITORIES: GEO

Similar Datasets

2023-10-14 | GSE215355 | GEO
2022-08-30 | PXD034464 | Pride
2022-05-31 | PXD033870 | Pride
2020-01-23 | PXD014844 | Pride
2022-01-08 | GSE189482 | GEO
2023-09-18 | GSE212573 | GEO
2023-09-18 | GSE212572 | GEO
2023-09-18 | GSE212570 | GEO
2023-09-18 | GSE212569 | GEO
2023-09-18 | GSE212571 | GEO