
Dataset Information


Biomarker detection and categorization in ribonucleic acid sequencing meta-analysis using Bayesian hierarchical models.


ABSTRACT: Meta-analysis combining multiple transcriptomic studies increases statistical power and accuracy in detecting differentially expressed genes. As next-generation sequencing experiments become mature and affordable, an increasing number of RNA-seq datasets are available in the public domain. The count-data-based technology provides better experimental accuracy, reproducibility and ability to detect low-expressed genes. A naive approach to combining multiple RNA-seq studies is to apply differential analysis tools such as edgeR and DESeq to each study and then combine the summary statistics (p-values or effect sizes) by conventional meta-analysis methods. Such a two-stage approach loses statistical power, especially for genes with short length or low expression abundance. In this paper, we propose a full Bayesian hierarchical model (namely, BayesMetaSeq) for RNA-seq meta-analysis by modelling count data, integrating information across genes and across studies, and modelling potentially heterogeneous differential signals across studies via latent variables. A Dirichlet process mixture (DPM) prior is further applied to the latent variables to categorize detected biomarkers according to their differential expression patterns across studies, facilitating improved interpretation and biological hypothesis generation. Simulations and a real application to multi-brain-region HIV-1 transgenic rats demonstrate improved sensitivity, accuracy and biological findings of the proposed method.
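The two-stage baseline that the abstract calls naive can be illustrated with a small sketch. The code below is not BayesMetaSeq and is not taken from the paper. It assumes Fisher's method as one example of a "conventional meta-analysis method" for combining per-study p-values, and it assumes those p-values have already been produced by a per-study tool such as edgeR or DESeq. The function name and the example p-values are hypothetical.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent per-study p-values for one gene (Fisher's method).

    Assumption: the inputs are p-values from separate, independent studies,
    e.g. one differential-expression test per study for the same gene.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(pvalues)
    # Fisher's statistic is -2 * sum(log p); under the null it follows a
    # chi-square distribution with 2k degrees of freedom.
    half_stat = -sum(math.log(p) for p in pvalues)  # statistic / 2

    # For even degrees of freedom 2k the chi-square survival function has a
    # closed form: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


# Hypothetical per-study p-values for two genes across three studies.
per_gene_pvalues = {
    "geneA": [0.04, 0.03, 0.20],
    "geneB": [0.60, 0.45, 0.80],
}
combined = {g: fisher_combined_pvalue(p) for g, p in per_gene_pvalues.items()}
```

Because this baseline sees only one summary p-value per study, it cannot borrow information across genes or model the counts directly, which is the limitation the abstract attributes to two-stage approaches.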

SUBMITTER: Ma T 

PROVIDER: S-EPMC5543999 | biostudies-literature | 2017 Aug

REPOSITORIES: biostudies-literature


Publications

Biomarker detection and categorization in ribonucleic acid sequencing meta-analysis using Bayesian hierarchical models.

Tianzhou Ma, Faming Liang, George Tseng

Journal of the Royal Statistical Society. Series C, Applied Statistics, 2016-12-16, issue 4



Similar Datasets

| S-EPMC4009397 | biostudies-literature
| S-EPMC3276744 | biostudies-literature
| S-EPMC5911710 | biostudies-literature
| PRJEB21102 | ENA
| S-EPMC2730548 | biostudies-literature
| S-EPMC6884669 | biostudies-literature