ABSTRACT:
Motivation
High-throughput RNA sequencing has revolutionized the scope and depth of transcriptome analysis. Accurate reconstruction of a phenotype-specific transcriptome is challenging due to the noise and variability of RNA-seq data. It requires computational identification of transcripts from multiple samples of the same phenotype, given the underlying consensus transcript structure.
Results
We present a Bayesian method, integrated assembly of phenotype-specific transcripts (IntAPT), that identifies phenotype-specific isoforms from multiple RNA-seq profiles. IntAPT features a novel two-layer Bayesian model that captures the presence of isoforms at the group layer and quantifies the abundance of isoforms at the sample layer. A spike-and-slab prior is used to model isoform expression and to enforce sparsity among the expressed isoforms. Dependencies between the existence of isoforms and their expression are modeled explicitly to facilitate parameter estimation. Model parameters are estimated iteratively using Gibbs sampling to infer the joint posterior distribution, from which the presence and abundance of isoforms can be reliably determined. Studies using both simulations and real datasets show that IntAPT consistently outperforms existing methods for phenotype-specific transcript assembly. Experimental results demonstrate that, despite sequencing errors, IntAPT performs robustly across multiple samples, resulting in notably improved identification of expressed isoforms of low abundance.
Availability and implementation
The IntAPT package is available at http://github.com/henryxushi/IntAPT.
Supplementary information
Supplementary data are available at Bioinformatics online.
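The two-layer spike-and-slab idea summarized under Results can be illustrated with a toy Gibbs sampler. This is a minimal sketch under stated assumptions, not IntAPT's actual model or code: the function name `gibbs_spike_slab`, the design matrix `X` (bins by candidate isoforms), the Gaussian slab and noise model, and all hyperparameters (`pi`, `tau2`, `sigma2`) are hypothetical choices made for illustration. A shared indicator per isoform plays the role of the group layer (presence), and a per-sample coefficient plays the role of the sample layer (abundance).

```python
import numpy as np

def gibbs_spike_slab(X, Y, n_iter=600, burn_in=200, pi=0.2, tau2=25.0,
                     sigma2=1.0, seed=0):
    """Toy two-layer spike-and-slab Gibbs sampler (illustrative only).

    X: (n_bins, n_isoforms) hypothetical design matrix.
    Y: (n_samples, n_bins) observed signal, one row per sample.
    Returns posterior inclusion probabilities and posterior mean abundances.
    """
    rng = np.random.default_rng(seed)
    n_samples, _ = Y.shape
    n_iso = X.shape[1]
    gamma = np.zeros(n_iso, dtype=bool)   # group layer: isoform j present?
    beta = np.zeros((n_samples, n_iso))   # sample layer: abundance per sample
    xtx = np.sum(X * X, axis=0)
    v = 1.0 / (xtx / sigma2 + 1.0 / tau2)  # slab conditional variance
    prior_logit = np.log(pi / (1.0 - pi))
    incl_sum = np.zeros(n_iso)
    beta_sum = np.zeros((n_samples, n_iso))
    for it in range(n_iter):
        for j in range(n_iso):
            # Residual of every sample with isoform j's contribution removed.
            resid = Y - beta @ X.T + np.outer(beta[:, j], X[:, j])
            # Per-sample conditional mean of the abundance under the slab.
            m = v[j] * (resid @ X[:, j]) / sigma2
            # Log Bayes factor for presence: abundances integrated out,
            # evidence summed over samples (the shared group-level indicator).
            log_bf = np.sum(0.5 * np.log(v[j] / tau2) + 0.5 * m**2 / v[j])
            logit = np.clip(prior_logit + log_bf, -50.0, 50.0)
            p_incl = 1.0 / (1.0 + np.exp(-logit))
            gamma[j] = rng.random() < p_incl
            # Spike: exactly zero. Slab: draw from the Gaussian conditional.
            beta[:, j] = rng.normal(m, np.sqrt(v[j])) if gamma[j] else 0.0
        if it >= burn_in:
            incl_sum += gamma
            beta_sum += beta
    kept = n_iter - burn_in
    return incl_sum / kept, beta_sum / kept

# Simulated example: 6 candidate isoforms, 3 samples, only isoforms 0 and 3 expressed.
rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(40, 6)).astype(float)
true_beta = np.zeros((3, 6))
true_beta[:, 0] = [5.0, 6.0, 4.0]
true_beta[:, 3] = [3.0, 2.5, 3.5]
Y = true_beta @ X.T + rng.normal(0.0, 1.0, size=(3, 40))
incl_prob, mean_beta = gibbs_spike_slab(X, Y)
print(np.round(incl_prob, 2))
```

The sketch simplifies in several ways: real abundances are non-negative while the Gaussian slab here is not constrained, and the noise variance is treated as known. The point is only to show how pooling evidence across samples in a shared presence indicator can drive unexpressed candidates to exactly zero while still estimating per-sample abundance.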
SUBMITTER: Shi X
PROVIDER: S-EPMC8097681 | biostudies-literature | 2021 May
REPOSITORIES: biostudies-literature

Shi X, Neuwald AF, Wang X, Wang TL, Hilakivi-Clarke L, Clarke R, Xuan J
Bioinformatics (Oxford, England) 20210501 5