
Dataset Information


Dupsifter: a lightweight duplicate marking tool for whole genome bisulfite sequencing.


ABSTRACT:

Summary

In whole genome sequencing data, polymerase chain reaction amplification results in duplicate DNA fragments coming from the same location in the genome. The process of preparing a whole genome bisulfite sequencing (WGBS) library, on the other hand, can create two DNA fragments from the same location that should not be considered duplicates. Currently, only one WGBS-aware duplicate marking tool exists. However, it only works with the output from a single tool, does not accept streaming input or output, and requires a substantial amount of memory relative to the input size. Dupsifter provides an aligner-agnostic duplicate marking tool that is lightweight, has streaming capabilities, and is memory efficient.
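The strand distinction described above can be illustrated with a minimal sketch. It is not Dupsifter's actual algorithm or data model: the `Read` record, its field names, and the `mark_duplicates` function are hypothetical, and the sketch assumes each read is already annotated with the bisulfite strand it derives from ("OT" for original top, "OB" for original bottom). Two reads count as duplicates only when they share both the location and the bisulfite strand.

```python
from collections import namedtuple

# Hypothetical, simplified read record (not Dupsifter's internal structure).
# bs_strand is assumed to be "OT" (original top) or "OB" (original bottom).
Read = namedtuple("Read", ["name", "chrom", "start", "end", "bs_strand"])


def mark_duplicates(reads):
    """Return the set of read names flagged as duplicates.

    Reads share a signature only if they cover the same location AND
    come from the same bisulfite strand. The first read seen with a
    given signature is kept; later reads with that signature are flagged.
    """
    seen = set()
    duplicates = set()
    for read in reads:
        signature = (read.chrom, read.start, read.end, read.bs_strand)
        if signature in seen:
            duplicates.add(read.name)
        else:
            seen.add(signature)
    return duplicates


reads = [
    Read("r1", "chr1", 100, 250, "OT"),
    Read("r2", "chr1", 100, 250, "OB"),  # same location, other strand: kept
    Read("r3", "chr1", 100, 250, "OT"),  # same location and strand as r1
    Read("r4", "chr1", 300, 450, "OT"),
]
print(sorted(mark_duplicates(reads)))  # prints ['r3']
```

A strand-unaware tool that keyed only on location would also flag `r2`, discarding a fragment that the abstract says should not be treated as a duplicate.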

Availability and implementation

Source code and binaries are freely available at https://github.com/huishenlab/dupsifter under the MIT license. Dupsifter is implemented in C and is supported on macOS and Linux.

SUBMITTER: Morrison J 

PROVIDER: S-EPMC10724848 | biostudies-literature | 2023 Dec

REPOSITORIES: biostudies-literature


Publications

Dupsifter: a lightweight duplicate marking tool for whole genome bisulfite sequencing.

Morrison J, Zhou W, Johnson BK, Shen H

Bioinformatics (Oxford, England), 2023 Dec 1, issue 12



Similar Datasets

| S-EPMC4682368 | biostudies-literature
| S-EPMC3458524 | biostudies-literature
| S-BSST612 | biostudies-other
| S-BSST618 | biostudies-other
| S-EPMC3371708 | biostudies-literature
| S-EPMC4931220 | biostudies-literature
| S-EPMC6120621 | biostudies-literature
| S-EPMC10288677 | biostudies-literature
| S-EPMC6821270 | biostudies-literature
| S-EPMC4344394 | biostudies-literature