Dataset Information

Accurate assembly of multi-end RNA-seq data with Scallop2.


ABSTRACT: Modern RNA-sequencing protocols can produce multi-end data, where multiple reads originating from the same transcript are attached to the same barcode. The long-range information in the multi-end reads is beneficial in phasing complicated spliced isoforms, but assembly algorithms that leverage such information are lacking. Here we introduce Scallop2, a reference-based assembler optimized for multi-end RNA-seq data. The algorithmic core of Scallop2 consists of three steps: (1) using an algorithm to "bridge" multi-end reads into single-end phasing paths in the context of a splice graph, (2) employing a method to refine erroneous splice graphs by utilizing multi-end reads that fail to bridge, and (3) piping the refined splice graph and the bridged phasing paths into an algorithm that integrates multiple phase-preserving decompositions. Tested on 561 cells in two Smart-seq3 datasets and on 10 Illumina paired-end RNA-seq samples, Scallop2 substantially improves assembly accuracy compared with two popular assemblers, StringTie2 and Scallop.
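The "bridging" idea in step (1) can be illustrated with a toy sketch. This is not Scallop2's implementation: the graph encoding, the function names `bridge` and `bridge_pair`, the example weights, and the choice to score a candidate bridge by its weakest edge (a maximum-bottleneck path) are all assumptions made here for illustration. The sketch takes two read ends already aligned to vertex paths in a splice graph (a DAG) and fills the unsequenced gap between them, returning nothing when no path exists, which corresponds to the reads that "fail to bridge" in step (2).

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding: vertex -> list of (successor, edge weight).
# Vertices stand for exons (or partial exons); weights stand for read support.
Graph = Dict[int, List[Tuple[int, float]]]


def bridge(graph: Graph, order: List[int], src: int, dst: int) -> Optional[List[int]]:
    """Return the src->dst path whose weakest edge is as heavy as possible.

    `order` must be a topological order of the DAG. Returns None if dst
    cannot be reached from src.
    """
    best = {src: float("inf")}   # best bottleneck score found per vertex
    prev: Dict[int, int] = {}    # back-pointers for path recovery
    for u in order:
        if u not in best:        # not reachable from src
            continue
        for v, w in graph.get(u, []):
            score = min(best[u], w)
            if score > best.get(v, float("-inf")):
                best[v] = score
                prev[v] = u
    if dst not in best:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def bridge_pair(graph: Graph, order: List[int],
                left: List[int], right: List[int]) -> Optional[List[int]]:
    """Join two read-end paths into one phasing path, or None if unbridgeable."""
    gap = bridge(graph, order, left[-1], right[0])
    if gap is None:
        return None
    # gap starts at left[-1] and ends at right[0]; drop the shared endpoints.
    return left + gap[1:] + right[1:]


# Toy splice graph with two alternative routes from vertex 1 to vertex 4.
GRAPH: Graph = {
    0: [(1, 10.0)],
    1: [(2, 8.0), (3, 3.0)],
    2: [(4, 7.0)],
    3: [(4, 3.0)],
    4: [(5, 9.0)],
}
ORDER = [0, 1, 2, 3, 4, 5]

if __name__ == "__main__":
    print(bridge_pair(GRAPH, ORDER, [0, 1], [4, 5]))
    print(bridge_pair(GRAPH, ORDER, [0, 1, 3], [2, 4]))
```

In this toy graph the route through vertex 2 has the stronger weakest edge, so the pair `[0, 1]` and `[4, 5]` is bridged through it, while the pair `[0, 1, 3]` and `[2, 4]` has no connecting path and is left unbridged.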

SUBMITTER: Zhang Q 

PROVIDER: S-EPMC9879047 | biostudies-literature | 2022 Mar

REPOSITORIES: biostudies-literature

Publications

Accurate assembly of multi-end RNA-seq data with Scallop2.

Zhang Qimin, Shi Qian, Shao Mingfu

Nature Computational Science, 2022-03-28, issue 3


Similar Datasets

S-EPMC5144000 | biostudies-literature
S-EPMC4331715 | biostudies-literature
S-EPMC2916723 | biostudies-literature
S-EPMC11748423 | biostudies-literature
S-EPMC6596894 | biostudies-literature
S-EPMC3025570 | biostudies-literature
S-EPMC4380033 | biostudies-literature
S-EPMC9754601 | biostudies-literature
S-EPMC4673974 | biostudies-literature
S-EPMC6614939 | biostudies-literature