Dataset Information

Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space.


ABSTRACT: Computational tools for integrative analyses of diverse single-cell experiments face formidable new challenges, including dramatic increases in data scale, sample heterogeneity, and the need to informatively cross-reference new data with foundational datasets. Here, we present SCALEX, a deep-learning method that integrates single-cell data by projecting cells into a batch-invariant, common cell-embedding space in a truly online manner (i.e., without retraining the model). SCALEX substantially outperforms online iNMF and other state-of-the-art non-online integration methods on benchmark single-cell datasets of diverse modalities (e.g., single-cell RNA sequencing, scRNA-seq; single-cell assay for transposase-accessible chromatin using sequencing, scATAC-seq), especially for datasets with partial overlaps, accurately aligning similar cell populations while retaining true biological differences. We showcase SCALEX's advantages by constructing continuously expandable single-cell atlases for human, mouse, and COVID-19 patients, each assembled from diverse data sources and growing with every new dataset. The online data integration capacity and superior performance make SCALEX particularly appropriate for large-scale single-cell applications to build upon previous scientific insights.

SUBMITTER: Xiong L 

PROVIDER: S-EPMC9574176 | biostudies-literature | 2022 Oct

REPOSITORIES: biostudies-literature

Publications

Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space.

Xiong L, Tian K, Li Y, Ning W, Gao X, Zhang QC

Nature Communications, 2022 Oct 17, issue 1


Similar Datasets

| S-EPMC8456427 | biostudies-literature
| S-EPMC7958465 | biostudies-literature
| S-EPMC6311449 | biostudies-literature
| S-EPMC6551256 | biostudies-literature
| S-EPMC8355612 | biostudies-literature
| S-EPMC9048671 | biostudies-literature
| S-EPMC11261982 | biostudies-literature
| S-EPMC7891139 | biostudies-literature
| S-EPMC9982060 | biostudies-literature
| S-EPMC8696097 | biostudies-literature