
Dataset Information


Probabilistic tensor decomposition extracts better latent embeddings from single-cell multiomic data.


ABSTRACT: Single-cell sequencing technology enables the simultaneous capture of multiomic data from multiple cells. The captured data can be represented by tensors, i.e. higher-order generalizations of matrices. However, existing analysis tools often treat the data as a collection of second-order matrices, discarding the correspondences among the features. We therefore propose a probabilistic tensor decomposition framework, SCOIT, to extract embeddings from single-cell multiomic data. SCOIT incorporates various distributions, including Gaussian, Poisson, and negative binomial distributions, to deal with sparse, noisy, and heterogeneous single-cell data. Our framework decomposes a multiomic tensor into a cell embedding matrix, a gene embedding matrix, and an omic embedding matrix, allowing for various downstream analyses. We applied SCOIT to eight single-cell multiomic datasets from different sequencing protocols. With the cell embeddings, SCOIT achieves superior cell clustering performance compared to nine state-of-the-art tools under various metrics, demonstrating its ability to dissect cellular heterogeneity. With the gene embeddings, SCOIT enables cross-omics gene expression analysis and integrative gene regulatory network study. Furthermore, the embeddings allow simultaneous cross-omics imputation, outperforming current imputation methods with the Pearson correlation coefficient increased by 3.38-39.26%; moreover, SCOIT accommodates the scenario in which subsets of the cells have only one omic profile available.
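To make the decomposition described in the abstract concrete, here is a minimal sketch of splitting a cells x genes x omics tensor into three embedding matrices. It is not SCOIT's implementation: it assumes a plain CP (CANDECOMP/PARAFAC) model fitted by alternating least squares, which corresponds only to a Gaussian-style squared-error fit and ignores the Poisson and negative binomial options. The function names `cp_als` and `reconstruct`, the rank, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def reconstruct(C, G, O):
    """Rebuild the 3-way tensor from cell (C), gene (G) and omic (O) embeddings."""
    return np.einsum("ir,jr,kr->ijk", C, G, O)


def cp_als(X, rank, n_iter=200, seed=0):
    """Fit a CP model to a cells x genes x omics array by alternating least squares.

    Illustrative only: squared-error loss, no missing-value handling.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes, n_omics = X.shape
    C = rng.normal(size=(n_cells, rank))
    G = rng.normal(size=(n_genes, rank))
    O = rng.normal(size=(n_omics, rank))
    for _ in range(n_iter):
        # Each update solves the least-squares problem for one factor
        # while the other two are held fixed (normal equations).
        C = np.einsum("ijk,jr,kr->ir", X, G, O) @ np.linalg.pinv((G.T @ G) * (O.T @ O))
        G = np.einsum("ijk,ir,kr->jr", X, C, O) @ np.linalg.pinv((C.T @ C) * (O.T @ O))
        O = np.einsum("ijk,ir,jr->kr", X, C, G) @ np.linalg.pinv((C.T @ C) * (G.T @ G))
    return C, G, O


# Synthetic example: 30 cells, 20 genes, 3 omics, true rank 2 (made-up data).
rng = np.random.default_rng(1)
X = reconstruct(
    rng.normal(size=(30, 2)),
    rng.normal(size=(20, 2)),
    rng.normal(size=(3, 2)),
)
C, G, O = cp_als(X, rank=2)
rel_err = np.linalg.norm(reconstruct(C, G, O) - X) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.2e}")
```

In this sketch the rows of `C` play the role of cell embeddings (inputs to clustering), the rows of `G` the gene embeddings, and `reconstruct(C, G, O)` gives a dense estimate that could stand in for imputed values across omics.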

SUBMITTER: Wang RH 

PROVIDER: S-EPMC10450184 | biostudies-literature | 2023 Aug

REPOSITORIES: biostudies-literature


Publications

Probabilistic tensor decomposition extracts better latent embeddings from single-cell multiomic data.

Wang Ruo Han (RH), Wang Jianping (J), Li Shuai Cheng (SC)

Nucleic Acids Research, 2023-08-01, issue 15



Similar Datasets

| S-EPMC7355243 | biostudies-literature
| S-EPMC8468466 | biostudies-literature
| S-EPMC11647523 | biostudies-literature
| S-EPMC7756762 | biostudies-literature
| S-EPMC6677658 | biostudies-literature
| S-EPMC9009670 | biostudies-literature
| S-EPMC6298052 | biostudies-literature
| S-EPMC5895786 | biostudies-literature
| S-EPMC7763286 | biostudies-literature
| S-EPMC6761323 | biostudies-literature