Genomics

Dataset Information

AMULET: A novel read count-based method for effective multiplet detection from single nucleus ATAC-seq data


ABSTRACT: Similar to other droplet-based single cell assays, single nucleus ATAC-seq (snATAC-seq) data harbor multiplets that confound downstream analyses. Detecting multiplets in snATAC-seq data is particularly challenging due to data sparsity and limited dynamic range (0 reads: closed chromatin; 1: open on one parental chromosome; 2: open on both parental chromosomes). Yet, these unique data features offer an opportunity to identify multiplets. AMULET (Atac MULtiplet Estimation Tool; also referred to here as ATAC-DoubletDetector, https://ucarlab.github.io/ATAC-DoubletDetector/) exploits these features: it enumerates the regions with >2 uniquely aligned reads across the genome to detect multiplets, an effective alternative to methods based on artificially generated multiplets (e.g., state-of-the-art ArchR). For benchmarking, we generated snATAC-seq data and measured the efficacy of AMULET in two primary human tissues: peripheral blood mononuclear cells (PBMCs) and pancreatic islet samples. AMULET detected multiplets with high precision (0.57, estimated via donor-based multiplexing) and high recall (0.85, estimated via simulated doublets) compared to alternatives, and was most effective when samples are sequenced deeply (e.g., median read count per nucleus >25K). At such depth in PBMCs, it captured 85% of simulated doublets (i.e., recall), significantly outperforming ArchR (24%). For lower read depth, AMULET and ArchR produced complementary results. Moreover, AMULET was equally effective in identifying homotypic multiplets (i.e., multiplets from the same cell type), which are missed by simulation-based methods.
Cell-specific marker peaks enabled accurate (85%) tracing of cellular origins of snATAC-seq multiplets. Accordingly, more abundant cells within a tissue are more likely to form multiplets and the majority of multiplets are homotypic. ATAC-DoubletDetector is a fast and effective multiplet detection/annotation tool for improved single cell epigenomic data analyses across diverse biological systems and conditions.
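
The read-count idea in the abstract can be illustrated with a minimal sketch. This is not the tool's actual implementation. It assumes fragments are available as (chromosome, start, end, barcode) tuples with half-open coordinates, and the function name `count_overlap_regions` and the `min_depth` parameter are hypothetical. For each barcode it counts the maximal regions covered by more than two fragments, which a single diploid nucleus should rarely produce. How such counts are turned into a multiplet call (thresholds or statistical testing) is not shown here.

```python
from collections import defaultdict


def count_overlap_regions(fragments, min_depth=3):
    """Count, per barcode, the maximal regions covered by >= min_depth fragments.

    fragments: iterable of (chrom, start, end, barcode) tuples, with
    half-open coordinates [start, end). Illustrative helper only.
    """
    counts = {}
    # (barcode, chrom) -> position -> net change in coverage at that position
    deltas = defaultdict(lambda: defaultdict(int))
    for chrom, start, end, barcode in fragments:
        counts.setdefault(barcode, 0)  # report barcodes with zero regions too
        deltas[(barcode, chrom)][start] += 1
        deltas[(barcode, chrom)][end] -= 1

    for (barcode, _chrom), changes in deltas.items():
        depth = 0
        in_region = False
        # Sweep left to right, tracking coverage depth
        for pos in sorted(changes):
            depth += changes[pos]
            if depth >= min_depth:
                if not in_region:
                    counts[barcode] += 1  # entered a new high-coverage region
                    in_region = True
            else:
                in_region = False
    return counts


if __name__ == "__main__":
    toy = [
        ("chr1", 100, 200, "cellA"),
        ("chr1", 150, 250, "cellA"),
        ("chr1", 180, 300, "cellA"),  # third fragment: depth 3 over [180, 200)
        ("chr1", 1000, 1100, "cellA"),
        ("chr1", 100, 200, "cellB"),
        ("chr1", 150, 250, "cellB"),  # depth never exceeds 2 for cellB
    ]
    print(count_overlap_regions(toy))
```

In this toy input, only `cellA` has a region with more than two overlapping fragments, so it would be the candidate flagged for closer inspection.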

ORGANISM(S): Homo sapiens

PROVIDER: GSE165212 | GEO | 2021/08/11

REPOSITORIES: GEO

Similar Datasets

| PRJNA693559 | ENA
2020-06-24 | GSE152981 | GEO
2019-09-01 | GSE125523 | GEO
2022-07-28 | GSE195460 | GEO
2022-09-18 | GSE185948 | GEO
2021-09-07 | E-MTAB-9765 | biostudies-arrayexpress
2023-05-31 | GSE211543 | GEO
2023-05-31 | GSE211542 | GEO
2021-09-07 | E-MTAB-10533 | biostudies-arrayexpress
2021-02-21 | GSE151302 | GEO