Dataset Information

Barcode identification for single cell genomics.


ABSTRACT: BACKGROUND: Single-cell sequencing experiments use short DNA barcode 'tags' to identify reads that originate from the same cell. To recover single-cell information from such experiments, reads must be grouped by their barcode tag, a crucial processing step that precedes other computations. However, this step can be difficult because barcodes suffer from high rates of mismatch and deletion errors. RESULTS: Here we present an approach to identify and error-correct barcodes by traversing the de Bruijn graph of circularized barcode k-mers. Our approach is based on the observation that circularizing a barcode sequence can yield error-free k-mers even when k is large relative to the length of the barcode sequence, a regime that is typical of single-cell barcoding applications. This allows reads to be assigned to consensus fingerprints constructed from k-mers. CONCLUSION: We show that for single-cell RNA-Seq, circularization improves the recovery of accurate single-cell transcriptome estimates, especially when there is a high number of errors per read. This approach is robust to the type of error (mismatch, insertion, deletion), as well as to the relative abundances of the cells. Sircel, a software package that implements this approach, is described and publicly available.
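
The circularization idea in the abstract can be illustrated with a minimal Python sketch. It is not taken from Sircel: the function names, the 12-base example barcode, and the choice of k = 8 are assumptions made for illustration only. It compares how many k-mers an error-containing barcode shares with the true barcode when k-mers are drawn from the linear sequence versus the circularized one.

```python
def linear_kmers(barcode: str, k: int) -> list[str]:
    """k-mers of the barcode read as an ordinary linear sequence."""
    return [barcode[i:i + k] for i in range(len(barcode) - k + 1)]


def circular_kmers(barcode: str, k: int) -> list[str]:
    """k-mers of the barcode read as a circle (hypothetical helper)."""
    if not 0 < k <= len(barcode):
        raise ValueError("k must be between 1 and len(barcode)")
    # Appending the first k-1 bases lets windows wrap around the end.
    extended = barcode + barcode[:k - 1]
    return [extended[i:i + k] for i in range(len(barcode))]


# Made-up 12-base barcode and a copy with one mismatch at index 7 (A -> T).
true_barcode = "ACGTTGCAAGCT"
observed = "ACGTTGCTAGCT"
k = 8  # large relative to the barcode length

shared_linear = set(linear_kmers(true_barcode, k)) & set(linear_kmers(observed, k))
shared_circular = set(circular_kmers(true_barcode, k)) & set(circular_kmers(observed, k))

print(len(shared_linear), len(shared_circular))
```

In this example every linear 8-mer spans the mismatched base, so no error-free k-mer survives, whereas the windows that wrap around the end of the circularized barcode avoid the error and still match the true barcode.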

SUBMITTER: Tambe A 

PROVIDER: S-EPMC6337828 | biostudies-literature | 2019 Jan

REPOSITORIES: biostudies-literature

Publications

Barcode identification for single cell genomics.

Tambe Akshay A, Pachter Lior L

BMC Bioinformatics, 2019-01-17, issue 1


Similar Datasets

| S-EPMC8149635 | biostudies-literature
| S-EPMC6509836 | biostudies-literature
2019-04-20 | GSE130065 | GEO
| S-EPMC5991360 | biostudies-literature
2015-11-20 | E-GEOD-74833 | biostudies-arrayexpress
| S-EPMC5282606 | biostudies-literature
| S-EPMC6039488 | biostudies-literature
| S-EPMC7018801 | biostudies-literature
| S-EPMC5901877 | biostudies-other
| PRJNA533780 | ENA