Dataset Information

Systematic analysis on the horse-shoe-like effect in PCA plots of scRNA-seq data.


ABSTRACT:

Motivation

In single-cell studies, principal component analysis (PCA) is widely used to reduce the dimensionality of datasets and to visualize them in 2D or 3D PC plots. Scientists often focus on the different clusters within a PC plot and overlook specific phenomena, such as the horse-shoe-like effect, that may reveal hidden knowledge about the underlying biological dataset. This phenomenon remains largely unexplored in single-cell studies.

Results

In this study, we investigated the horse-shoe-like effect in PC plots using simulated and real scRNA-seq datasets. We systematically explain the horse-shoe-like phenomenon from various inter-related perspectives. First, we establish an intuitive understanding with the help of simulated datasets. We then generalize the acquired knowledge to real biological scRNA-seq data. The experimental results provide logical explanations for the appearance of the horse-shoe-like effect in PC plots. Furthermore, we identify a potential problem with the well-known 'distance saturation property' theory, which has been credited with inducing the horse-shoe phenomenon. Finally, we analyse a mathematical model for the horse-shoe effect that suggests trigonometric solutions for the estimated eigenvectors. Comparing the results of the mathematical model with those from simulated and real scRNA-seq datasets, we observe a significant resemblance.

Availability and implementation

The code for reproducing the results of this study is available at: https://github.com/najeebullahshah/PCA-Horse-Shoe.

SUBMITTER: Shah N 

PROVIDER: S-EPMC11316618 | biostudies-literature | 2024

REPOSITORIES: biostudies-literature

Publications

Systematic analysis on the horse-shoe-like effect in PCA plots of scRNA-seq data.

Shah Najeebullah, Meng Qiuchen, Zou Ziheng, Zhang Xuegong

Bioinformatics advances 20240729 1


Similar Datasets

| S-BSST858 | biostudies-other
| S-EPMC11519519 | biostudies-literature
| S-EPMC8356963 | biostudies-literature
| S-EPMC10351969 | biostudies-literature
| S-EPMC11741621 | biostudies-literature
| S-EPMC7141853 | biostudies-literature
| S-EPMC8157426 | biostudies-literature
| S-EPMC9437856 | biostudies-literature
| S-EPMC11568468 | biostudies-literature
| S-EPMC10914288 | biostudies-literature