
Dataset Information


Learning interpretable cellular and gene signature embeddings from single-cell transcriptomic data.


ABSTRACT: The advent of single-cell RNA sequencing (scRNA-seq) technologies has revolutionized transcriptomic studies. However, large-scale integrative analysis of scRNA-seq data remains a challenge, largely due to unwanted batch effects and the limited transferability, interpretability, and scalability of the existing computational methods. We present single-cell Embedded Topic Model (scETM). Our key contribution is the utilization of a transferable neural-network-based encoder while having an interpretable linear decoder via a matrix tri-factorization. In particular, scETM simultaneously learns an encoder network to infer cell type mixture and a set of highly interpretable gene embeddings, topic embeddings, and batch-effect linear intercepts from multiple scRNA-seq datasets. scETM is scalable to over 10^6 cells and confers remarkable cross-tissue and cross-species zero-shot transfer-learning performance. Using gene set enrichment analysis, we find that scETM-learned topics are enriched in biologically meaningful and disease-related pathways. Lastly, scETM enables the incorporation of known gene sets into the gene embeddings, thereby directly learning the associations between pathways and topics via the topic embeddings.
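
The linear decoder described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it only shows one plausible reading of a matrix tri-factorization with batch-effect intercepts, in which per-cell topic mixtures, topic embeddings, and gene embeddings are multiplied together and a per-batch offset is added before normalizing over genes. All names (`theta`, `alpha`, `rho`, `batch_intercepts`, `decode`), the softmax normalization, and the toy dimensions are assumptions made for illustration, and the random `theta` stands in for the output of the encoder network.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def decode(theta, alpha, rho, batch_intercepts, batch_ids):
    """Hypothetical linear decoder: tri-factorization plus batch offsets.

    theta:            (cells, topics)   per-cell topic mixture
    alpha:            (topics, emb_dim) topic embeddings
    rho:              (emb_dim, genes)  gene embeddings
    batch_intercepts: (batches, genes)  per-batch linear intercepts
    batch_ids:        (cells,)          batch index of each cell
    """
    logits = theta @ alpha @ rho + batch_intercepts[batch_ids]
    # Normalize over genes so each cell gets expected expression proportions.
    return softmax(logits, axis=1)

# Toy sizes, chosen only for illustration.
rng = np.random.default_rng(0)
n_cells, n_topics, emb_dim, n_genes, n_batches = 6, 3, 4, 10, 2

# Stand-in for the encoder output: each row is a mixture over topics.
theta = softmax(rng.normal(size=(n_cells, n_topics)), axis=1)
alpha = rng.normal(size=(n_topics, emb_dim))
rho = rng.normal(size=(emb_dim, n_genes))
batch_intercepts = rng.normal(size=(n_batches, n_genes))
batch_ids = np.array([0, 0, 0, 1, 1, 1])

expression = decode(theta, alpha, rho, batch_intercepts, batch_ids)

# Topic-by-gene scores: the part of the model that stays directly readable.
topic_gene_scores = alpha @ rho
```

Because the decoder in this sketch is purely linear up to the final normalization, each row of `topic_gene_scores` can be read as a ranking of genes for one topic, which is the kind of interpretability the abstract refers to.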

SUBMITTER: Zhao Y 

PROVIDER: S-EPMC8421403 | biostudies-literature | 2021 Sep

REPOSITORIES: biostudies-literature


Publications

Learning interpretable cellular and gene signature embeddings from single-cell transcriptomic data.

Yifan Zhao, Huiyu Cai, Zuobai Zhang, Jian Tang, Yue Li

Nature Communications, 2021 Sep 6, issue 1



Similar Datasets

S-EPMC10085635 | biostudies-literature
S-EPMC10287920 | biostudies-literature
S-EPMC10958532 | biostudies-literature
S-EPMC11654984 | biostudies-literature
S-EPMC11411778 | biostudies-literature
S-EPMC10055147 | biostudies-literature
S-EPMC8494224 | biostudies-literature
S-EPMC9643784 | biostudies-literature
S-EPMC7397672 | biostudies-literature
S-EPMC9249201 | biostudies-literature