Dataset Information

eSVD-DE: Cohort-wide differential expression in single-cell RNA-seq data using exponential-family embeddings.


ABSTRACT:

Background

Single-cell RNA-sequencing (scRNA) datasets are becoming increasingly popular in clinical and cohort studies, but methods for identifying differentially expressed (DE) genes in datasets with many individuals are lacking. While numerous methods exist to find DE genes in scRNA data from a few individuals, differential-expression testing for large cohorts of case and control individuals poses unique challenges due to the substantial effects of human variation, i.e., individual-level confounding covariates that are difficult to account for in the presence of sparsely observed genes.

Results

We develop eSVD-DE, a matrix factorization that pools information across genes and removes confounding covariate effects, followed by a novel two-sample test of mean expression between case and control individuals. In general, differential testing after dimension reduction inflates Type-1 errors. We overcome this by testing for differences between the case and control individuals' posterior mean distributions via a hierarchical model. In previously published datasets of various biological systems, eSVD-DE is more accurate and more powerful than other DE methods typically repurposed for analyzing cohort-wide differential expression.

Conclusions

eSVD-DE offers a novel and powerful way to test for DE genes among cohorts after performing a dimension reduction. Accurate identification of differential expression on the individual level, instead of the cell level, is important for linking scRNA-seq studies to our understanding of the human population.
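
As a rough illustration of the individual-level testing idea described in the Results, the following Python sketch shrinks each cell's count toward a fitted mean, averages the resulting posterior means within each individual, and compares case and control individuals with a Welch-type statistic. It is not the eSVD-DE implementation. The Gamma-Poisson model, the `shape` value, the function names, and the toy data are all assumptions made for illustration, and `fitted_mean` stands in for whatever denoised expression a matrix factorization would supply. The sketch also ignores within-individual variance, which a full hierarchical model would presumably account for.

```python
from math import sqrt
from statistics import mean, variance


def posterior_mean(count, lib_size, fitted_mean, shape=2.0):
    # Assumed model (hypothetical, not taken from the source):
    #   rate  ~ Gamma(shape, rate = shape / fitted_mean)
    #   count ~ Poisson(lib_size * rate)
    # By conjugacy the posterior is
    #   Gamma(shape + count, shape / fitted_mean + lib_size),
    # whose mean is returned here.
    return (shape + count) / (shape / fitted_mean + lib_size)


def individual_means(cells_by_individual, shape=2.0):
    # One number per individual: the average of that individual's
    # per-cell posterior means. Each cell is a tuple
    # (count, lib_size, fitted_mean).
    return [
        mean(posterior_mean(c, s, m, shape) for c, s, m in cells)
        for cells in cells_by_individual.values()
    ]


def welch_t(case, control):
    # Welch two-sample t-statistic, where the sampling unit is the
    # individual rather than the cell.
    se = sqrt(variance(case) / len(case) + variance(control) / len(control))
    return (mean(case) - mean(control)) / se


# Toy data for a single gene (entirely made up).
case_cells = {
    "case1": [(5, 1000, 0.004), (3, 800, 0.004)],
    "case2": [(4, 1200, 0.003), (6, 900, 0.003)],
    "case3": [(7, 1000, 0.005)],
}
control_cells = {
    "ctrl1": [(1, 1000, 0.001), (0, 900, 0.001)],
    "ctrl2": [(2, 1100, 0.002)],
    "ctrl3": [(0, 1000, 0.001), (1, 1000, 0.001)],
}

t_stat = welch_t(individual_means(case_cells), individual_means(control_cells))
print(f"individual-level Welch t-statistic: {t_stat:.3f}")
```

Because the groups are compared over individuals, adding more cells per person sharpens each individual's estimate but does not inflate the sample size of the test, which is the distinction between individual-level and cell-level testing that the Conclusions emphasize.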

SUBMITTER: Lin KZ 

PROVIDER: S-EPMC10690270 | biostudies-literature | 2024 Mar

REPOSITORIES: biostudies-literature

Publications

eSVD-DE: Cohort-wide differential expression in single-cell RNA-seq data using exponential-family embeddings.

Lin KZ, Qiu Y, Roeder K

bioRxiv: the preprint server for biology, 2024-03-01



Similar Datasets

S-EPMC10941434 | biostudies-literature
S-EPMC8336573 | biostudies-literature
S-EPMC8570643 | biostudies-literature
S-EPMC11310084 | biostudies-literature
S-EPMC9915700 | biostudies-literature
S-EPMC7520047 | biostudies-literature
S-EPMC5937712 | biostudies-literature
S-EPMC7202736 | biostudies-literature
S-EPMC6157076 | biostudies-literature
S-EPMC4059460 | biostudies-literature