
Dataset Information


Long-read mapping to repetitive reference sequences using Winnowmap2.


ABSTRACT: Approximately 5-10% of the human genome remains inaccessible due to the presence of repetitive sequences such as segmental duplications and tandem repeat arrays. We show that existing long-read mappers often yield incorrect alignments and variant calls within long, near-identical repeats, as they remain vulnerable to allelic bias. In the presence of a nonreference allele within a repeat, a read sampled from that region could be mapped to an incorrect repeat copy. To address this limitation, we developed a new long-read mapping method, Winnowmap2, by using minimal confidently alignable substrings. Winnowmap2 computes each read mapping through a collection of confident subalignments. This approach is more tolerant of structural variation and more sensitive to paralog-specific variants within repeats. Our experiments highlight that Winnowmap2 successfully addresses the issue of allelic bias, enabling more accurate downstream variant calls in repetitive sequences.
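
The idea of anchoring a read through short, unambiguous substrings can be illustrated with a toy example. The sketch below is not Winnowmap2's implementation: it assumes exact string matching and treats "occurs exactly once in the reference" as a stand-in for "confidently alignable". The function names, the two ten-base repeat copies, and the read are all hypothetical and chosen only for illustration.

```python
def count_occurrences(text, pattern):
    """Count exact (possibly overlapping) occurrences of pattern in text."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def minimal_unique_substrings(read, reference):
    """For each read offset i, find the shortest substring read[i:i+k]
    that occurs exactly once in the reference.

    Returns a list of (read_offset, length, reference_position) tuples.
    Offsets with no unique substring are skipped.
    Toy stand-in for 'minimal confidently alignable substrings':
    exact-match uniqueness replaces a real mapper's confidence score.
    """
    anchors = []
    for i in range(len(read)):
        for k in range(1, len(read) - i + 1):
            sub = read[i:i + k]
            n = count_occurrences(reference, sub)
            if n == 0:
                break  # no longer extension can match either
            if n == 1:
                anchors.append((i, k, reference.find(sub)))
                break
    return anchors


# Two near-identical repeat copies that differ at a single base
# (a paralog-specific variant), separated by an 'N' spacer.
copy_a = "ACGTACGGTC"
copy_b = "ACGTACGATC"
reference = copy_a + "NNNN" + copy_b

# Hypothetical read sampled from copy B, spanning the distinguishing base.
read = "TACGATC"

anchors = minimal_unique_substrings(read, reference)
for offset, length, ref_pos in anchors:
    print(offset, length, ref_pos, read[offset:offset + length])
```

In this toy case every anchor that exists must extend far enough to cover the base that distinguishes the two copies, so all anchors land in copy B and agree on the same read-to-reference offset. Read offsets whose remaining suffix matches both copies equally well yield no anchor at all.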

SUBMITTER: Jain C 

PROVIDER: S-EPMC10510034 | biostudies-literature | 2022 Jun

REPOSITORIES: biostudies-literature


Publications

Long-read mapping to repetitive reference sequences using Winnowmap2.

Jain C, Rhie A, Hansen NF, Koren S, Phillippy AM

Nature Methods, 2022-04-01, issue 6



Similar Datasets

| S-EPMC4481695 | biostudies-literature
| S-EPMC6416333 | biostudies-literature
| S-EPMC8147415 | biostudies-literature
| S-EPMC3468387 | biostudies-literature
| S-EPMC7913452 | biostudies-literature
| S-EPMC8361843 | biostudies-literature
| S-EPMC9645665 | biostudies-literature
| S-EPMC8508064 | biostudies-literature
| S-EPMC8248648 | biostudies-literature
| S-EPMC8228171 | biostudies-literature