CASSIA: a multi-agent large language model for reference free, interpretable, and automated cell annotation of single-cell RNA-sequencing data
ABSTRACT: Cell type annotation is an essential step in single-cell RNA-sequencing analysis, and numerous annotation methods are available. Most require a combination of computational and domain-specific expertise, and they frequently yield inconsistent results that can be challenging to interpret. Large language models have the potential to expand accessibility while reducing manual input and improving accuracy, but existing approaches suffer from overconfidence, hallucinations, and a lack of reasoning. To address these limitations, we developed CASSIA for automated, accurate, and interpretable cell annotation of single-cell RNA-sequencing data. As demonstrated in analyses of 970 cell types, CASSIA improves annotation accuracy in benchmark datasets as well as in complex and rare cell populations. It also provides users with reasoning and quality assessment to ensure interpretability, guard against hallucinations, and calibrate confidence.
ORGANISM(S): Homo sapiens
PROVIDER: GSE307976 | GEO | 2025/09/19
REPOSITORIES: GEO