Unknown

Dataset Information

0

Characterization of gene cluster heterogeneity in single-cell transcriptomic data within and across cancer types.


ABSTRACT: Despite the remarkable progress in probing tumor transcriptomic heterogeneity by single-cell RNA sequencing (sc-RNAseq) data, several gaps exist in prior studies. Tumor heterogeneity is frequently mentioned but not quantified. Clustering analyses typically target cells rather than genes, and differential levels of transcriptomic heterogeneity of gene clusters are not characterized. Relations between gene clusters inferred from multiple datasets remain less explored. We provided a series of quantitative methods to analyze cancer sc-RNAseq data. First, we proposed two quantitative measures to assess intra-tumoral heterogeneity/homogeneity. Second, we established a hierarchy of gene clusters from sc-RNAseq data, devised an algorithm to reduce the gene cluster hierarchy to a compact structure, and characterized the gene clusters with functional enrichment and heterogeneity. Third, we developed an algorithm to align the gene cluster hierarchies from multiple datasets to a small number of meta gene clusters. By applying these methods to nine cancer sc-RNAseq datasets, we discovered that cancer cell transcriptomes were more homogeneous within tumors than the accompanying normal cells. Furthermore, many gene clusters from the nine datasets were aligned to two large meta gene clusters, which had high and low heterogeneity and were enriched with distinct functions. Finally, we found the homogeneous meta gene cluster retained stronger expression coherence and associations with survival times in bulk level RNAseq data than the heterogeneous meta gene cluster, yet the combinatorial expression patterns of breast cancer subtypes in bulk level data were not preserved in single-cell data. The inference outcomes derived from nine cancer sc-RNAseq datasets provide insights about the contributing factors for transcriptomic heterogeneity of cancer cells and complex relations between bulk level and single-cell RNAseq data. They demonstrate the utility of our methods to enable a comprehensive characterization of co-expressed gene clusters in a wide range of sc-RNAseq data in cancers and beyond.

SUBMITTER: Tiong KL 

PROVIDER: S-EPMC9235070 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC7808958 | biostudies-literature
| S-EPMC7098173 | biostudies-literature
| S-EPMC8421403 | biostudies-literature
| S-EPMC4674153 | biostudies-literature
| S-EPMC7235285 | biostudies-literature
| S-EPMC7839219 | biostudies-literature
2021-02-15 | GSE158275 | GEO