
Dataset Information


Minimizing Reference Bias with an Impute-First Approach.


ABSTRACT: Pangenome indexes reduce reference bias in sequencing data analysis. However, a greater reduction in bias can be achieved using a personalized reference, e.g., a diploid human reference constructed to match a donor individual's alleles. We present a novel impute-first alignment framework that combines elements of genotype imputation and pangenome alignment. It begins by genotyping the individual from a subsample of the input reads. It next uses a reference panel and an efficient imputation algorithm to impute a personalized diploid reference. Finally, it indexes the personalized reference and applies a read aligner, which could be a linear or graph aligner, to align the full read set to the personalized reference. This framework achieves higher variant-calling recall (99.54% vs. 99.37%), precision (99.36% vs. 99.18%), and F1 (99.45% vs. 99.28%) than a graph-based pangenome. The personalized reference is also smaller and faster to query than a pangenome index, making it an overall advantageous choice for whole-genome DNA sequencing experiments.
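
The step of turning imputed genotypes into a personalized diploid reference can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes SNVs only, phased genotypes, and 0-based positions, and the `Variant` layout and the `personalize` function are hypothetical names introduced here for illustration.

```python
from typing import List, Tuple

# Hypothetical record layout (an assumption, not taken from the paper):
# (0-based position, ref allele, alt allele, phased genotype).
# The genotype (a, b) gives the allele on haplotype 1 and haplotype 2,
# where 0 = reference allele and 1 = alternate allele.
Variant = Tuple[int, str, str, Tuple[int, int]]


def personalize(reference: str, variants: List[Variant]) -> Tuple[str, str]:
    """Apply phased SNV genotypes to a linear reference, giving two haplotypes."""
    haplotypes = [list(reference), list(reference)]
    for pos, ref, alt, genotype in variants:
        # Guard against a variant that does not match the reference sequence.
        if reference[pos] != ref:
            raise ValueError(f"reference mismatch at position {pos}")
        for hap, allele in zip(haplotypes, genotype):
            if allele == 1:
                hap[pos] = alt
    return "".join(haplotypes[0]), "".join(haplotypes[1])


reference = "ACGTACGTAC"
imputed = [
    (2, "G", "T", (1, 0)),  # heterozygous: alt on haplotype 1 only
    (7, "T", "C", (1, 1)),  # homozygous alt: alt on both haplotypes
]
hap1, hap2 = personalize(reference, imputed)
print(hap1)  # ACTTACGCAC
print(hap2)  # ACGTACGCAC
```

In the framework described above, the two resulting haplotype sequences would then be indexed and used as the alignment target for the full read set. Handling indels would require coordinate bookkeeping that this sketch omits.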

SUBMITTER: Vaddadi NSK 

PROVIDER: S-EPMC10705441 | biostudies-literature | 2023 Dec

REPOSITORIES: biostudies-literature


Publications

Minimizing Reference Bias with an Impute-First Approach.

Kavya Vaddadi, Taher Mun, Ben Langmead

bioRxiv: the preprint server for biology, 2024-05-16



Similar Datasets

S-EPMC5759930 | biostudies-literature
S-EPMC6120949 | biostudies-literature
S-EPMC10351896 | biostudies-literature
S-EPMC3188800 | biostudies-literature
S-EPMC7780692 | biostudies-literature
S-EPMC10125539 | biostudies-literature
S-EPMC11845529 | biostudies-literature
S-EPMC7677776 | biostudies-literature
S-EPMC2733930 | biostudies-other
S-EPMC5925782 | biostudies-literature