Dataset Information

A comparison of methods accounting for batch effects in differential expression analysis of UMI count based single cell RNA sequencing.


ABSTRACT: Accounting for batch effects, especially latent batch effects, in differential expression (DE) analysis is critical for identifying true biological effects. Single-cell RNA sequencing (scRNA-seq) is a powerful tool for quantifying cell-to-cell variation in transcript abundance and characterizing cellular dynamics. Although many scRNA-seq DE analysis methods accommodate known batch variables, their performance has not been systematically evaluated. Moreover, the challenge of accounting for latent batch variables in scRNA-seq DE analysis is largely unmet. In contrast, many methods have been developed to account for batch variables (either known or latent) in other high-dimensional data, especially bulk RNA-seq. We extensively evaluate 11 methods for handling batch variables in different scRNA-seq DE analysis scenarios, with a primary focus on latent batch variables. We demonstrate that for known batch variables, incorporating them as covariates into a regression model outperforms approaches that use a batch-corrected matrix. For latent batches, fixed effects models have inflated FDRs, whereas aggregation-based methods and mixed effects models suffer significant power loss. Surrogate variable based methods generally control the FDR well while achieving good power with small group effects. However, their performance (except that of SVA) deteriorates substantially in scenarios involving large group effects and/or group label impurity. In these settings, SVA achieves relatively good performance despite an occasionally inflated FDR (up to 0.2). Finally, we make the following recommendations for scRNA-seq DE analysis: 1) incorporate known batch variables as covariates instead of using batch-corrected data; and 2) employ SVA for latent batch correction. However, better methods are still needed to fully unleash the power of scRNA-seq.
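
The first recommendation, including a known batch variable as a regression covariate, can be illustrated with a small simulation. The sketch below is not one of the 11 methods evaluated in the paper: it assumes a single null gene on a log-like expression scale, an ordinary least squares fit, and an unbalanced design in which group membership is correlated with batch. The helper `fit_group_effect` and all simulation parameters are hypothetical names and values chosen only for illustration.

```python
import numpy as np


def fit_group_effect(y, group, batch=None):
    """Return the OLS coefficient for `group`, optionally adjusting for `batch`."""
    cols = [np.ones(len(y)), group.astype(float)]
    if batch is not None:
        cols.append(batch.astype(float))  # known batch enters as a covariate
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]  # coefficient of the group indicator


rng = np.random.default_rng(0)

# Two batches of 200 cells each (assumed sizes).
batch = np.repeat([0, 1], 200)

# Unbalanced design: 50/200 cells of batch 0 and 150/200 cells of batch 1
# belong to group 1, so group and batch are correlated.
group = np.concatenate([
    np.r_[np.ones(50), np.zeros(150)],
    np.r_[np.ones(150), np.zeros(50)],
]).astype(int)

true_group_effect = 0.0  # null gene: no real difference between groups
batch_effect = 1.0       # assumed additive batch shift

y = true_group_effect * group + batch_effect * batch + rng.normal(0.0, 0.3, size=400)

naive = fit_group_effect(y, group)            # batch ignored
adjusted = fit_group_effect(y, group, batch)  # batch as covariate

print(f"group effect, batch ignored:      {naive:.3f}")
print(f"group effect, batch as covariate: {adjusted:.3f}")
```

In this toy setting the unadjusted fit attributes part of the batch shift to the group label, which is the kind of bias that produces false positives for null genes, while the adjusted fit estimates a group effect near the true value of zero. Real UMI count data would call for a count-appropriate model rather than the Gaussian linear model assumed here.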

SUBMITTER: Chen W 

PROVIDER: S-EPMC7163294 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature

Publications

A comparison of methods accounting for batch effects in differential expression analysis of UMI count based single cell RNA sequencing.

Wenan Chen, Silu Zhang, Justin Williams, Bensheng Ju, Bridget Shaner, John Easton, Gang Wu, Xiang Chen

Computational and Structural Biotechnology Journal, 2020-03-30



Similar Datasets

| S-EPMC5984373 | biostudies-literature
2018-04-30 | GSE113660 | GEO
2019-09-13 | GSE135564 | GEO
| S-EPMC8053088 | biostudies-literature
| S-EPMC6964114 | biostudies-literature
| S-EPMC7289686 | biostudies-literature
| S-EPMC5737676 | biostudies-literature
| S-EPMC5440469 | biostudies-literature
| PRJNA559293 | ENA