Dataset Information

Benchmarking integration of single-cell differential expression.


ABSTRACT: Integration of single-cell RNA sequencing data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential expression analysis of single-cell data with multiple batches. We show that batch effects, sequencing depth and data sparsity substantially impact their performance. Notably, we find that the use of batch-corrected data rarely improves the analysis for sparse data, whereas batch covariate modeling improves the analysis when batch effects are substantial. We show that for low-depth data, single-cell techniques based on zero-inflation models degrade performance, whereas the analysis of uncorrected data using limmatrend, the Wilcoxon test and a fixed effects model performs well. We suggest several high-performance methods under different conditions based on various simulation and real data analyses. Additionally, we demonstrate that differential expression analysis for a specific cell type outperforms that of large-scale bulk sample data in prioritizing disease-related genes.
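
To make the idea of batch covariate modeling concrete, the following is a minimal toy sketch in Python. It is not one of the 46 benchmarked workflows and does not reproduce any tool named in the abstract. The function names, the single-gene toy values and the two-batch design are all assumptions chosen for illustration. It compares a naive difference in group means with the group coefficient from a linear model that includes batch as a fixed effect, computed here by centering within each batch.

```python
from statistics import mean


def naive_effect(expr, group):
    """Difference in mean expression (group 1 minus group 0), ignoring batch."""
    g1 = [y for y, g in zip(expr, group) if g == 1]
    g0 = [y for y, g in zip(expr, group) if g == 0]
    return mean(g1) - mean(g0)


def batch_adjusted_effect(expr, group, batch):
    """Group coefficient of the linear model expr ~ group + batch.

    Batch enters as a fixed effect. Centering expression and the group
    indicator within each batch and regressing one on the other gives the
    same coefficient as fitting the full model with batch dummies.
    """
    y_mean, g_mean = {}, {}
    for b in set(batch):
        y_mean[b] = mean([y for y, bb in zip(expr, batch) if bb == b])
        g_mean[b] = mean([g for g, bb in zip(group, batch) if bb == b])
    y_c = [y - y_mean[b] for y, b in zip(expr, batch)]
    g_c = [g - g_mean[b] for g, b in zip(group, batch)]
    sxx = sum(g * g for g in g_c)
    if sxx == 0:
        raise ValueError("group does not vary within any batch")
    return sum(g * y for g, y in zip(g_c, y_c)) / sxx


# Hypothetical toy data for one gene: the true group effect is +1.0 and
# batch "B" is shifted by +2.0. The design is unbalanced, so batch is
# partly confounded with group.
expr = [5.0, 5.0, 5.0, 6.0, 7.0, 8.0, 8.0, 8.0]
group = [0, 0, 0, 1, 0, 1, 1, 1]
batch = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(naive_effect(expr, group))                  # 2.0 (inflated by the batch shift)
print(batch_adjusted_effect(expr, group, batch))  # 1.0 (batch shift removed)
```

In this toy case the naive estimate doubles the true effect because most cells of group 1 come from the shifted batch, while the batch-adjusted estimate recovers it. Real workflows fit such models per gene with dedicated statistical packages and also estimate uncertainty, which this sketch omits.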

SUBMITTER: Nguyen HCT 

PROVIDER: S-EPMC10030080 | biostudies-literature | 2023 Mar

REPOSITORIES: biostudies-literature

Publications

Benchmarking integration of single-cell differential expression.

Nguyen HCT, Baik B, Yoon S, Park T, Nam D

Nature Communications, 2023 Mar 21, issue 1



Similar Datasets

| S-EPMC8748196 | biostudies-literature
| S-EPMC9071439 | biostudies-literature
| S-EPMC10576752 | biostudies-literature
| S-EPMC10944570 | biostudies-literature
| S-EPMC9663917 | biostudies-literature
| S-EPMC9915567 | biostudies-literature
| S-EPMC10594700 | biostudies-literature
| S-EPMC10762948 | biostudies-literature
| S-EPMC10002703 | biostudies-literature
| S-EPMC9487674 | biostudies-literature