Dataset Information

Fast analysis of scATAC-seq data using a predefined set of genomic regions.

ABSTRACT: Background: Analysis of scATAC-seq data has recently been scaled to thousands of cells. While processing of other types of single-cell data has been accelerated by alignment-free techniques, the pipelines available for scATAC-seq data still require large computational resources. We propose an approach based on pseudoalignment that reduces execution times and hardware requirements at little cost in precision. Methods: Public data for 10k PBMCs were downloaded from the 10x Genomics website. Reads were aligned to various references derived from DNase I hypersensitive sites (DHS) using kallisto and quantified with bustools. We compared our results with the publicly available ones produced by cellranger-atac. We subsequently tested our approach on scATAC-seq data for the K562 cell line. Results: We found that kallisto does not introduce biases in the quantification of known peaks; the cell groups identified are consistent with those identified by the standard method. We also found that cell identification is robust when analysis is performed using a DHS-derived reference in place of de novo identification of ATAC peaks. Lastly, we found that our approach is suitable for reliable quantification of gene activity based on scATAC-seq signal, and thus allows efficient labelling of cell groups based on marker genes. Conclusions: Analysis of scATAC-seq data with kallisto produces results in line with standard pipelines while being considerably faster; using a set of known DHS as reference does not affect the ability to characterize the cell populations.

SUBMITTER: Giansanti V

PROVIDER: S-EPMC7308914 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature

Publications

Fast analysis of scATAC-seq data using a predefined set of genomic regions.

Giansanti V, Tang M, Cittaro D

F1000Research, 2020-03-20

PMID: 32595951

Similar Datasets

Joint analysis of scATAC-seq datasets using epiConv. | S-EPMC9338487 | biostudies-literature

scATAcat: cell-type annotation for scATAC-seq data. | S-EPMC11459382 | biostudies-literature

A scATAC-seq atlas of chromatin accessibility in axolotl brain regions. | S-EPMC10502032 | biostudies-literature

Fast clustering and cell-type annotation of scATAC data using pre-trained embeddings | S-EPMC11224678 | biostudies-literature

MOCHA's advanced statistical modeling of scATAC-seq data enables functional genomic inference in large human cohorts. | S-EPMC11316085 | biostudies-literature

Multiplexed Analysis of Retinal Gene Expression and Chromatin Accessibility using scRNA-Seq and scATAC-Seq. | S-EPMC8356148 | biostudies-literature

scNCL: transferring labels from scRNA-seq to scATAC-seq data with neighborhood contrastive regularization. | S-EPMC10457667 | biostudies-literature

scDART: integrating unmatched scRNA-seq and scATAC-seq data and learning cross-modality relationship simultaneously. | S-EPMC9238247 | biostudies-literature

scAWMV: an adaptively weighted multi-view learning framework for the integrative analysis of parallel scRNA-seq and scATAC-seq data. | S-EPMC9805575 | biostudies-literature

Fast and interpretable genomic data analysis using multiple approximate kernel learning. | S-EPMC9235505 | biostudies-literature