
Dataset Information


Accurate feature selection improves single-cell RNA-seq cell clustering.


ABSTRACT: Cell clustering is one of the most important and commonly performed tasks in single-cell RNA sequencing (scRNA-seq) data analysis. An important step in cell clustering is to select a subset of genes (referred to as 'features') whose expression patterns will then be used for downstream clustering. A good set of features should include the genes that distinguish different cell types, and the quality of such a set can have a significant impact on clustering accuracy. All existing scRNA-seq clustering tools include a feature selection step that relies on simple unsupervised feature selection methods, mostly based on the statistical moments of gene-wise expression distributions. In this work, we carefully evaluate the impact of feature selection on cell clustering accuracy. In addition, we develop a feature selection algorithm named FEAture SelecTion (FEAST), which provides more representative features. We apply the method to 12 public scRNA-seq datasets and demonstrate that using features selected by FEAST with existing clustering tools significantly improves clustering accuracy.
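
To make the "statistical moments" idea concrete, here is a minimal sketch of the kind of simple unsupervised selection the abstract attributes to existing tools: rank genes by dispersion (variance divided by mean) and keep the top few. This is not FEAST, whose procedure is not described in this record. The function name `select_top_genes_by_dispersion`, the toy matrix, and the gene names are hypothetical, chosen only for illustration.

```python
from statistics import mean, pvariance


def select_top_genes_by_dispersion(counts, gene_names, n_top):
    """Rank genes by dispersion (variance / mean) and return the top n_top names.

    counts: list of cells, each a list of per-gene expression values.
    This is an illustrative moment-based filter, not the FEAST algorithm.
    """
    scores = []
    for j, gene in enumerate(gene_names):
        values = [cell[j] for cell in counts]
        mu = mean(values)
        # Genes with zero mean expression carry no signal; score them 0.
        score = pvariance(values, mu) / mu if mu > 0 else 0.0
        scores.append((score, gene))
    # Highest dispersion first; ties broken by gene name for determinism.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [gene for _, gene in scores[:n_top]]


# Toy matrix: 4 cells x 3 genes (hypothetical values, assumed for illustration).
toy_counts = [
    [10, 5, 0],
    [12, 5, 1],
    [0, 5, 0],
    [1, 5, 1],
]
toy_genes = ["geneA", "geneB", "geneC"]
selected = select_top_genes_by_dispersion(toy_counts, toy_genes, n_top=2)
```

In this toy example the gene with constant expression across cells receives a dispersion of zero and is dropped, while the genes that vary between cells are kept and would be passed on to a clustering tool.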

SUBMITTER: Su K 

PROVIDER: S-EPMC8644062 | biostudies-literature | 2021 Sep

REPOSITORIES: biostudies-literature


Publications

Accurate feature selection improves single-cell RNA-seq cell clustering.

Kenong Su, Tianwei Yu, Hao Wu

Briefings in Bioinformatics, 2021-09-01, issue 5



Similar Datasets

| S-EPMC10547911 | biostudies-literature
| S-EPMC9108753 | biostudies-literature
| S-EPMC9754601 | biostudies-literature
| S-EPMC5371246 | biostudies-literature
| S-EPMC4881296 | biostudies-literature
| S-EPMC11920870 | biostudies-literature
| S-EPMC7214470 | biostudies-literature
| S-EPMC11304914 | biostudies-literature
| S-EPMC10888887 | biostudies-literature
| S-EPMC8386505 | biostudies-literature