ABSTRACT:
Motivation
Cell-type clustering is a crucial first step in single-cell RNA-seq data analysis. However, existing clustering methods often produce different cluster assignments because each uses its own data pre-processing, choice of distance metric, and feature-extraction strategy, which limits their practical application.
Results
We propose the Cross-Tabulation Ensemble Clustering (CTEC) method, which formulates two re-clustering strategies (distribution-based and outlier-based) via cross-tabulation. Benchmarking experiments on five scRNA-seq datasets show that CTEC offers significant improvements over the individual clustering methods. Moreover, CTEC-DB outperforms the state-of-the-art ensemble methods for single-cell data clustering, with 45.4% and 17.1% improvement over the single-cell aggregated from ensemble clustering method (SAFE) and the single-cell aggregated clustering via Mixture model ensemble method (SAME), respectively, on the two-method ensemble test.
Availability and implementation
The source code of the benchmark in this work is available at the GitHub repository https://github.com/LWCHN/CTEC.git.
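As a rough illustration of what cross-tabulating two clustering results means, the sketch below counts how many cells fall into each pair of (cluster from method A, cluster from method B). It is a minimal sketch and not the CTEC implementation: the function names, the toy labels, and the `min_cells` threshold rule for flagging cells are all assumptions made for this example. The actual re-clustering strategies are in the linked repository.

```python
from collections import Counter


def cross_tabulate(labels_a, labels_b):
    """Count cells in each (cluster in A, cluster in B) pair."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must cover the same cells")
    return Counter(zip(labels_a, labels_b))


def flag_small_entries(labels_a, labels_b, min_cells=2):
    """Hypothetical rule (an assumption, not CTEC's strategy):
    keep a cell's (A, B) pair when at least `min_cells` cells share it,
    otherwise mark the cell with None as a candidate for re-clustering."""
    table = cross_tabulate(labels_a, labels_b)
    assigned = [
        pair if table[pair] >= min_cells else None
        for pair in zip(labels_a, labels_b)
    ]
    return table, assigned


# Toy labels for seven cells from two hypothetical clustering methods.
method_a = [0, 0, 0, 1, 1, 1, 1]
method_b = ["x", "x", "y", "y", "y", "y", "x"]

table, assigned = flag_small_entries(method_a, method_b)
for pair, count in sorted(table.items()):
    print(pair, count)
```

Cells on which the two methods agree concentrate in a few large table entries, while cells in sparse entries are the ones an ensemble method has to reassign.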
SUBMITTER: Wang L
PROVIDER: S-EPMC10985676 | biostudies-literature | 2024 Mar
REPOSITORIES: biostudies-literature
Wang L, Hong C, Song J, Yao J
Bioinformatics (Oxford, England), 2024-03-01, 4