Dataset Information

Strawberry: Fast and accurate genome-guided transcript reconstruction and quantification from RNA-Seq.


ABSTRACT: We propose a novel method and software tool, Strawberry, for transcript reconstruction and quantification from RNA-Seq data under the guidance of genome alignment and independent of gene annotation. Strawberry consists of two modules: assembly and quantification. The novelty of Strawberry is that the two modules use different optimization frameworks but share the same graph data structure, which allows a highly efficient, expandable and accurate algorithm for dealing with large data sets. The assembly module parses aligned reads into splicing graphs and uses network flow algorithms to select the most likely transcripts. The quantification module uses a latent class model to assign read counts from the nodes of splicing graphs to transcripts. Strawberry simultaneously estimates the transcript abundances and corrects for sequencing bias through an EM algorithm. Based on simulations, Strawberry outperforms Cufflinks and StringTie in terms of both assembly and quantification accuracy. When evaluated on a real data set, the transcript expression estimated by Strawberry has the highest correlation with Nanostring probe counts, an independent experimental measure of transcript expression. Strawberry is written in C++14 and is available as open source software at https://github.com/ruolin/strawberry under the MIT license.
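
To make the quantification idea in the abstract concrete, the following is a minimal sketch of a generic EM loop that splits node-level read counts among compatible transcripts. It is not Strawberry's actual model: it omits the sequencing bias correction, and the function name `em_abundances`, the `cond_prob` matrix (an assumed probability that a read from a given transcript lands in a given node) and the toy numbers are all hypothetical choices made for illustration.

```python
def em_abundances(node_counts, cond_prob, n_iter=200):
    """Estimate per-transcript read fractions with a plain EM loop.

    node_counts[i]  : number of reads observed in node (read class) i
    cond_prob[i][t] : assumed P(node i | transcript t); 0.0 if incompatible
    Returns a list theta where theta[t] is the estimated read fraction of t.
    """
    n_tx = len(cond_prob[0])
    theta = [1.0 / n_tx] * n_tx  # start from a uniform guess

    for _ in range(n_iter):
        expected = [0.0] * n_tx
        for count, row in zip(node_counts, cond_prob):
            # E-step: share this node's reads among compatible transcripts
            weights = [theta[t] * row[t] for t in range(n_tx)]
            norm = sum(weights)
            if norm == 0.0:
                continue  # no transcript explains this node; skip it
            for t in range(n_tx):
                expected[t] += count * weights[t] / norm
        assigned = sum(expected)
        if assigned == 0.0:
            return theta  # nothing could be assigned; keep current estimate
        # M-step: new fractions are the normalized expected read counts
        theta = [e / assigned for e in expected]
    return theta


# Toy example (hypothetical): node 0 is unique to transcript 0, node 1 is
# shared by both transcripts, node 2 is unique to transcript 1.
counts = [30, 40, 10]
cond = [[0.5, 0.0],
        [0.5, 0.5],
        [0.0, 0.5]]
theta = em_abundances(counts, cond)
print(theta)  # approaches [0.75, 0.25]
```

In this toy case the reads in the shared node carry no information about which transcript they came from, so the estimate is driven by the 30:10 ratio of the two unique nodes. A real quantifier would also account for transcript effective lengths and, as the abstract notes for Strawberry, sequencing bias.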

SUBMITTER: Liu R 

PROVIDER: S-EPMC5720828 | biostudies-literature | 2017 Nov

REPOSITORIES: biostudies-literature

Publications

Strawberry: Fast and accurate genome-guided transcript reconstruction and quantification from RNA-Seq.

Liu Ruolin R, Dickerson Julie J

PLoS computational biology 20171127 11


Similar Datasets

| S-EPMC3163565 | biostudies-literature
| S-EPMC4673974 | biostudies-literature
| S-EPMC4881296 | biostudies-literature
| S-EPMC4071332 | biostudies-literature
| S-EPMC3851240 | biostudies-literature
| S-EPMC4411664 | biostudies-literature
| S-EPMC6101504 | biostudies-literature
| S-EPMC5144000 | biostudies-literature
| S-EPMC3789545 | biostudies-other
| S-EPMC4971760 | biostudies-literature