Transcriptomics

Dataset Information

Resolution of the curse of dimensionality in single-cell RNA-sequencing data analysis


ABSTRACT: Single-cell RNA sequencing (scRNA-seq) can determine gene expression in numerous individual cells simultaneously, promoting progress in the biomedical sciences. However, scRNA-seq data are high-dimensional with substantial technical noise, including dropouts. During analysis of scRNA-seq data, such noise engenders a statistical problem known as the curse of dimensionality (COD). Based on high-dimensional statistics, we herein formulate a noise reduction method, RECODE (resolution of the curse of dimensionality), for high-dimensional data with random sampling noise. We show that RECODE consistently resolves COD in relevant scRNA-seq data with unique molecular identifiers. RECODE does not involve dimension reduction and recovers expression values for all genes, including lowly expressed genes, realizing precise delineation of cell-fate transitions and identification of rare cells with all gene information. Compared to representative imputation methods, RECODE employs different principles and exhibits superior overall performance in cell-clustering, expression-value recovery, and single-cell level analysis. The RECODE algorithm is parameter-free, data-driven, deterministic, and high-speed, and its applicability can be predicted based on the variance normalization performance. We propose RECODE as a powerful strategy for preprocessing noisy high-dimensional data.
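The curse of dimensionality mentioned in the abstract can be illustrated with a small simulation. The sketch below is not RECODE and does not use this dataset. It assumes, purely for illustration, that random sampling noise can be mimicked by Poisson draws, and that two hypothetical cell groups differ in only 20 genes. The function name `distance_contrast` and all parameter values are made up for this example. As more noisy, uninformative genes are added, the between-group distance becomes nearly indistinguishable from the within-group distance.

```python
import numpy as np


def mean_pairwise_distance(x, y):
    """Mean Euclidean distance between every row of x and every row of y."""
    # Squared distances via |x|^2 + |y|^2 - 2 x.y, which avoids a 3-D array.
    sq = (x ** 2).sum(axis=1)[:, None] + (y ** 2).sum(axis=1)[None, :] - 2.0 * x @ y.T
    return np.sqrt(np.maximum(sq, 0.0)).mean()


def distance_contrast(n_genes, n_cells=40, n_signal=20, seed=0):
    """Ratio of mean between-group to mean within-group distance.

    A ratio near 1 means the two groups can no longer be told apart by distance.
    """
    rng = np.random.default_rng(seed)
    # True mean expression: identical in both groups except for n_signal genes.
    mean_a = np.full(n_genes, 1.0)
    mean_b = mean_a.copy()
    mean_b[:n_signal] = 3.0
    # Assumption: Poisson draws stand in for random sampling noise in counts.
    a = rng.poisson(mean_a, size=(n_cells, n_genes)).astype(float)
    b = rng.poisson(mean_b, size=(n_cells, n_genes)).astype(float)
    half = n_cells // 2
    within = mean_pairwise_distance(a[:half], a[half:])  # disjoint cells, same group
    between = mean_pairwise_distance(a, b)
    return between / within


if __name__ == "__main__":
    for n_genes in (20, 200, 2000, 5000):
        print(n_genes, round(distance_contrast(n_genes), 3))
```

With only the 20 informative genes the contrast is large. With thousands of genes it approaches 1, because accumulated noise from the uninformative genes dominates every distance. This is the kind of effect a noise reduction step applied before clustering is meant to counter.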

ORGANISM(S): Homo sapiens

PROVIDER: GSE175525 | GEO | 2022/08/04

REPOSITORIES: GEO

Similar Datasets

2016-09-09 | GSE75790 | GEO
2021-09-09 | PXD020515 | Pride
2021-01-31 | E-MTAB-9916 | biostudies-arrayexpress
2018-08-18 | GSE118704 | GEO
2019-02-22 | GSE126908 | GEO
2019-02-22 | GSE126906 | GEO
2018-07-25 | GSE117618 | GEO
2019-12-19 | GSE142286 | GEO
2018-08-18 | GSE118706 | GEO
2018-08-19 | GSE117617 | GEO