Transcriptomics

Dataset Information

0

Bayesian Correlation is a robust similarity measure for single cell RNA-seq data


ABSTRACT: Single-cell analysis of the transcriptome deepens our understanding of an individual cell's contribution to its microenvironment. Using single-cell analysis to study complex biological processes requires state-of-the-art computational tools. Assessing similarity is highly important for bioinformatics algorithms in order to determine correlations between biological information. Similarity can appear by chance, particularly for low expressed entities. This is especially relevant in single cell RNA-seq (scRNA-seq) because the read counts obtained are lower compared to bulk RNA-sequencing and therefore classic bioinformatics tools are insufficient to obtain reproducible results. Recently, a Bayesian correlation scheme, that assigns low correlation values to correlations coming from low expressed genes, has been proposed to assess similarity for bulk RNA-seq and miRNA. This Bayesian method uses a prior distribution before using empirical evidence. Our goal was to extend the properties of this Bayesian correlation scheme to scRNA-seq data. We assessed 3 ways to compute similarity. First, we computed the similarity of each pair of genes over all cells. Second, we identified specific cell populations and computed the correlation in those specific cells. Third, we computed the similarity of each pair of genes over all clusters, by including the total mRNA expression in those cells. To study the effect of the number of cells on the method, we did not rely on simulated data, we generated 4 scRNA-seq mouse liver cell libraries with a varying number of input cells. Results: We show that Bayesian correlations are more reproducible than Pearson correlations in all the scenarios studied. Compared to Pearson correlations, Bayesian correlations have a smaller dependence on the number of input cells. We demonstrate that the Bayesian correlation algorithm assigns high similarity values to genes with a biological relevance in a specific population. Significance: Our results demonstrate that Bayesian correlation is a robust similarity measure for scRNA-seq datasets. The Bayesian method allows researchers to study similarity between pairs of genes without discarding low expressed entities and to minimize biasing the results by fake correlations. Taken together, using our method of Bayesian correlation the reproducibility of scRNA-seq experiments is increased significantly.

ORGANISM(S): Mus musculus

PROVIDER: GSE134134 | GEO | 2019/12/31

REPOSITORIES: GEO

Similar Datasets

| PRJNA554015 | ENA
2007-11-01 | GSE9337 | GEO
2019-03-09 | GSE128066 | GEO
2022-07-15 | GSE180139 | GEO
2022-07-01 | GSE200090 | GEO
2022-07-15 | GSE180140 | GEO
2023-11-27 | PXD044001 | Pride
2021-09-09 | PXD020515 | Pride
2019-06-01 | GSE127832 | GEO
2023-12-11 | GSE227732 | GEO