Dataset Information

Accelerated dimensionality reduction of single-cell RNA sequencing data with fastglmpca.

ABSTRACT:

Summary

Motivated by theoretical and practical issues that arise when applying principal component analysis (PCA) to count data, Townes et al. introduced "Poisson GLM-PCA", a variation of PCA adapted to count data, as a tool for dimensionality reduction of single-cell RNA sequencing (scRNA-seq) data. However, fitting GLM-PCA is computationally challenging. Here we study this problem and show that a simple algorithm, which we call "Alternating Poisson Regression" (APR), produces better-quality fits, in less time, than existing algorithms. APR is also memory-efficient and lends itself to parallel implementation on multi-core processors, both of which help in handling large scRNA-seq datasets. We illustrate the benefits of this approach on three publicly available scRNA-seq datasets. The new algorithms are implemented in an R package, fastglmpca.
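To make the idea of alternating Poisson regressions concrete, the following is a minimal NumPy sketch, not the fastglmpca implementation. It assumes the simplified model Y[i, j] ~ Poisson(exp((L F^T)[i, j])) without offsets or intercepts, and it assumes one backtracked Newton step per regression as the inner solver; the source does not state which solver the package uses. The names `poisson_loglik`, `_update_rows`, and `fit_apr` are hypothetical and chosen for illustration only.

```python
import numpy as np


def poisson_loglik(Y, L, F):
    """Poisson log-likelihood of Y under rates exp(L @ F.T), up to a constant."""
    H = L @ F.T
    return float(np.sum(Y * H - np.exp(H)))


def _update_rows(Y, L, F, ridge=1e-8):
    """With F fixed, improve each row of L by one backtracked Newton step.

    Row i is an independent Poisson regression with log link:
    Y[i, :] ~ Poisson(exp(F @ L[i])).
    """
    K = F.shape[1]
    for i in range(Y.shape[0]):
        y = Y[i]
        eta = F @ L[i]
        mu = np.exp(eta)
        grad = F.T @ (y - mu)
        # Negative Hessian of the row log-likelihood; the small ridge
        # (an assumption of this sketch) keeps it positive definite.
        hess = (F * mu[:, None]).T @ F + ridge * np.eye(K)
        step = np.linalg.solve(hess, grad)
        old = np.sum(y * eta - mu)
        t = 1.0
        # Halve the step until the row log-likelihood does not decrease.
        with np.errstate(over="ignore", invalid="ignore"):
            while t > 1e-8:
                cand = L[i] + t * step
                eta_new = F @ cand
                if np.sum(y * eta_new - np.exp(eta_new)) >= old:
                    L[i] = cand
                    break
                t *= 0.5
    return L


def fit_apr(Y, K, n_iter=20, seed=0):
    """Alternate between updating L (F fixed) and F (L fixed)."""
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    L = 0.1 * rng.standard_normal((n, K))
    F = 0.1 * rng.standard_normal((m, K))
    history = [poisson_loglik(Y, L, F)]
    for _ in range(n_iter):
        L = _update_rows(Y, L, F)      # regress each row of Y on F
        F = _update_rows(Y.T, F, L)    # regress each column of Y on L
        history.append(poisson_loglik(Y, L, F))
    return L, F, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Simulated counts, for illustration only.
    Y = rng.poisson(np.exp(0.5 * rng.normal(size=(30, 2)) @ (0.5 * rng.normal(size=(20, 2))).T))
    L, F, history = fit_apr(Y, K=2, n_iter=10)
    print(history[0], history[-1])
```

Because every row (or column) update is an independent regression, the loop over `i` in this sketch could in principle be distributed across cores, which is consistent with the parallelism the abstract mentions.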

Availability and implementation

The fastglmpca R package is released on CRAN for Windows, macOS and Linux, and the source code is available at github.com/stephenslab/fastglmpca under the open source GPL-3 license. Scripts to reproduce the results in this paper are also available in the GitHub repository and on Zenodo.

SUBMITTER: Weine E

PROVIDER: S-EPMC11322042 | biostudies-literature | 2024 Aug

REPOSITORIES: biostudies-literature

Publications

Accelerated dimensionality reduction of single-cell RNA sequencing data with fastglmpca.

Weine E, Carbonetto P, Stephens M

Bioinformatics (Oxford, England), 2024 Aug 1, issue 8

PMID: 39110511

Similar Datasets

Accelerated dimensionality reduction of single-cell RNA sequencing data with fastglmpca. | S-EPMC10996495 | biostudies-literature

Dimensionality reduction for single cell RNA sequencing data using constrained robust non-negative matrix factorization. | S-EPMC7671375 | biostudies-literature

Infusing structural assumptions into dimensionality reduction for single-cell RNA sequencing data to identify small gene sets. | S-EPMC11897155 | biostudies-literature

A Comparison for Dimensionality Reduction Methods of Single-Cell RNA-seq Data. | S-EPMC8021860 | biostudies-literature

Ensemble dimensionality reduction and feature gene extraction for single-cell RNA-seq data. | S-EPMC7673125 | biostudies-literature

Bioconductor workflow for single-cell RNA sequencing: Normalization, dimensionality reduction, clustering, and lineage inference. | S-EPMC5558107 | biostudies-literature

Resolution of the curse of dimensionality in single-cell RNA sequencing data analysis. | S-EPMC9363502 | biostudies-literature

A deep adversarial variational autoencoder model for dimensionality reduction in single-cell RNA sequencing analysis. | S-EPMC7035735 | biostudies-literature

A topology-preserving dimensionality reduction method for single-cell RNA-seq data using graph autoencoder. | S-EPMC8501122 | biostudies-literature

AE-TPGG: a novel autoencoder-based approach for single-cell RNA-seq data imputation and dimensionality reduction. | S-EPMC9607720 | biostudies-literature