
Dataset Information


pathMap: a path-based mapping tool for long noisy reads with high sensitivity.


ABSTRACT: With the rapid development of single-molecule sequencing (SMS) technologies, output read lengths are continuously increasing. Mapping such reads onto a reference genome is one of the most fundamental tasks in sequence analysis. Mapping sensitivity is becoming a major concern, since a more sensitive mapper detects more aligned regions on the reference and recovers more aligned bases, both of which are useful for downstream analysis. In this study, we present pathMap, a novel k-mer graph-based mapper specifically designed for mapping SMS reads with high sensitivity. By viewing an alignment chain as a path containing as many anchors as possible in the matched k-mer graph, pathMap treats chaining as a path selection problem in a directed graph. pathMap iteratively searches for the longest path among the remaining nodes, so that more high-quality candidate chains can be detected and aligned. Compared with other state-of-the-art mapping methods such as minimap2 and Winnowmap2, experimental results on simulated and real-life datasets demonstrate that pathMap yields at least 11.50% more mapped chains than its closest competitor and increases mapping sensitivity by 17.28% and 13.84% of bases over the next-best mapper for Pacific Biosciences and Oxford Nanopore sequencing data, respectively. In addition, pathMap is more robust to sequencing errors and more sensitive in species- and strain-specific identification of pathogens using MinION reads.
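The chaining idea in the abstract (repeatedly take the longest path through the remaining anchors) can be illustrated with a small sketch. This is not pathMap's implementation: the anchor representation as `(read_pos, ref_pos)` pairs, the edge rule, and the parameters `max_gap` and `min_anchors` are all assumptions made for illustration, and a path is scored simply by the number of anchors it contains.

```python
from typing import List, Tuple

# Hypothetical anchor type: (read_pos, ref_pos) of one matched k-mer.
Anchor = Tuple[int, int]


def longest_path(anchors: List[Anchor], max_gap: int) -> List[Anchor]:
    """Return the path with the most anchors, found by dynamic programming on a DAG.

    Assumed edge rule: anchor j precedes anchor i when both coordinates
    strictly increase and neither gap exceeds max_gap.
    """
    nodes = sorted(anchors)  # sorting by read_pos gives a topological order
    n = len(nodes)
    if n == 0:
        return []
    best = [1] * n   # best[i]: anchors on the longest path ending at node i
    prev = [-1] * n  # prev[i]: predecessor of node i on that path
    for i in range(n):
        qi, ri = nodes[i]
        for j in range(i):
            qj, rj = nodes[j]
            colinear = qj < qi and rj < ri
            close = qi - qj <= max_gap and ri - rj <= max_gap
            if colinear and close and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j
    end = max(range(n), key=lambda i: best[i])
    path = []
    while end != -1:  # walk predecessors back to the start of the path
        path.append(nodes[end])
        end = prev[end]
    return path[::-1]


def iterative_chains(anchors: List[Anchor], max_gap: int = 500,
                     min_anchors: int = 3) -> List[List[Anchor]]:
    """Extract chains by repeatedly removing the longest path from the anchor set."""
    remaining = set(anchors)
    chains = []
    while remaining:
        path = longest_path(list(remaining), max_gap)
        if len(path) < min_anchors:  # stop once only short paths are left
            break
        chains.append(path)
        remaining.difference_update(path)
    return chains


if __name__ == "__main__":
    demo = [(0, 100), (10, 110), (20, 120), (30, 130),
            (5, 900), (15, 910), (25, 920), (40, 5000)]
    for chain in iterative_chains(demo):
        print(chain)
```

Removing each selected path before searching again is what allows a second, shorter chain at a different reference locus to surface, instead of being overshadowed by the single best chain. The quadratic inner loop keeps the sketch short; a practical mapper would need a faster predecessor search.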

SUBMITTER: Wei ZG 

PROVIDER: S-EPMC10959152 | biostudies-literature | 2024 Jan

REPOSITORIES: biostudies-literature


Publications

pathMap: a path-based mapping tool for long noisy reads with high sensitivity.

Wei Ze-Gang, Zhang Xiao-Dan, Fan Xing-Guo, Qian Yu, Liu Fei, Wu Fang-Xiang

Briefings in Bioinformatics, 2024-01-01, issue 2



Similar Datasets

| S-EPMC11320709 | biostudies-literature
| S-EPMC7879691 | biostudies-literature
| S-EPMC6547545 | biostudies-literature
| S-EPMC9117619 | biostudies-literature
| S-EPMC6902338 | biostudies-literature
| S-EPMC10997618 | biostudies-literature
| S-EPMC5131822 | biostudies-literature
| S-EPMC9632051 | biostudies-literature
| S-EPMC3936741 | biostudies-literature
| S-EPMC8771625 | biostudies-literature