Dataset Information

CTEC: a cross-tabulation ensemble clustering approach for single-cell RNA sequencing data analysis.


ABSTRACT:

Motivation

Cell-type clustering is a crucial first step in single-cell RNA-seq data analysis. However, existing clustering methods often produce different cluster assignments depending on their data pre-processing, choice of distance metric, and feature-extraction strategy, which limits their practical application.

Results

We propose the Cross-Tabulation Ensemble Clustering (CTEC) method, which formulates two re-clustering strategies (distribution-based and outlier-based) via cross-tabulation. Benchmarking experiments on five scRNA-seq datasets show that the proposed CTEC method offers significant improvements over the individual clustering methods. Moreover, CTEC-DB outperforms the state-of-the-art ensemble methods for single-cell data clustering, with 45.4% and 17.1% improvement over the single-cell aggregated from ensemble clustering method (SAFE) and the single-cell aggregated clustering via Mixture model ensemble method (SAME), respectively, on the two-method ensemble test.
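
As a rough illustration of the cross-tabulation step only, the sketch below counts how many cells fall into each pair of clusters produced by two methods. It is not the CTEC implementation: the function name `cross_tabulate`, the toy label vectors, and the printed layout are assumptions made for this example, and the distribution- and outlier-based re-clustering strategies are not reproduced here.

```python
from collections import Counter


def cross_tabulate(labels_a, labels_b):
    """Count cells for every (cluster from method A, cluster from method B) pair.

    Hypothetical helper for illustration; not taken from the CTEC repository.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both label vectors must cover the same cells")
    return Counter(zip(labels_a, labels_b))


# Toy cluster assignments for eight cells from two hypothetical methods.
labels_a = [0, 0, 0, 1, 1, 2, 2, 2]
labels_b = [0, 0, 1, 1, 1, 2, 2, 0]

table = cross_tabulate(labels_a, labels_b)

# Print the table with method A clusters as rows and method B clusters as columns.
rows = sorted(set(labels_a))
cols = sorted(set(labels_b))
for r in rows:
    print(r, [table[(r, c)] for c in cols])
```

Cells that sit in a sparsely populated entry of the table are the ones on which the two methods disagree, which is the kind of information an ensemble approach can act on.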

Availability and implementation

The source code of the benchmark in this work is available at the GitHub repository https://github.com/LWCHN/CTEC.git.

SUBMITTER: Wang L 

PROVIDER: S-EPMC10985676 | biostudies-literature | 2024 Mar

REPOSITORIES: biostudies-literature

Publications

CTEC: a cross-tabulation ensemble clustering approach for single-cell RNA sequencing data analysis.

Wang Liang, Hong Chenyang, Song Jiangning, Yao Jianhua

Bioinformatics (Oxford, England), 2024-03-01, issue 4

Similar Datasets

| S-EPMC6477982 | biostudies-literature
| S-EPMC10636845 | biostudies-literature
| S-EPMC7444317 | biostudies-literature
| S-EPMC6742414 | biostudies-literature
| S-EPMC8742847 | biostudies-literature
| S-EPMC8168892 | biostudies-literature
| S-EPMC9108753 | biostudies-literature
| S-EPMC10359080 | biostudies-literature
| S-EPMC9151659 | biostudies-literature
| S-EPMC10287920 | biostudies-literature