
Dataset Information


Best practices for analyzing imputed genotypes from low-pass sequencing in dogs.


ABSTRACT: Although DNA array-based approaches for genome-wide association studies (GWAS) permit the collection of thousands of low-cost genotypes, this often comes at the expense of resolution and completeness, as SNP chip technologies are ultimately limited by the SNPs chosen during array development. An alternative low-cost approach is low-pass whole genome sequencing (WGS) followed by imputation. Rather than relying on high levels of genotype confidence at a set of select loci, low-pass WGS and imputation rely on the combined information from millions of randomly sampled low-confidence genotypes. To investigate low-pass WGS and imputation in the dog, we assessed accuracy and performance by downsampling 97 high-coverage (> 15×) WGS datasets from 51 different breeds to approximately 1× coverage, simulating low-pass WGS. Using a reference panel of 676 dogs from 91 breeds, genotypes were imputed from the downsampled data and compared to a truth set of genotypes generated from high-coverage WGS. Using our truth set, we optimized a variant quality filtering strategy that retained approximately 80% of 14 M imputed sites and lowered the imputation error rate from 3.0% to 1.5%. Seven million sites remained with a MAF > 5% and an average imputation quality score of 0.95. Finally, we simulated the impact of imputation errors on outcomes for case-control GWAS: small effect sizes were most affected, whereas medium-to-large effect sizes were only slightly affected. These analyses provide best practice guidelines for study design and data post-processing of low-pass WGS-imputed genotypes in dogs.
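The evaluation described in the abstract (downsample, impute, compare against a truth set, then filter on quality) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the 0/1/2 genotype encoding (alternate-allele counts), the toy data, and the 0.8 quality cutoff are all assumptions made for illustration only.

```python
from typing import Optional, Sequence


def downsample_fraction(original_coverage: float, target_coverage: float = 1.0) -> float:
    """Fraction of reads to keep so expected coverage drops to the target."""
    if original_coverage <= 0:
        raise ValueError("original_coverage must be positive")
    return min(1.0, target_coverage / original_coverage)


def error_rate(imputed: Sequence[Optional[int]], truth: Sequence[Optional[int]]) -> float:
    """Share of compared sites where the imputed genotype differs from the truth.

    Sites with a missing genotype (None) on either side are skipped.
    """
    compared = [(i, t) for i, t in zip(imputed, truth) if i is not None and t is not None]
    if not compared:
        raise ValueError("no sites with genotypes in both sets")
    return sum(i != t for i, t in compared) / len(compared)


def filter_by_quality(values: Sequence, quality: Sequence[float], min_quality: float) -> list:
    """Keep only the entries whose per-site quality score meets the cutoff."""
    return [v for v, q in zip(values, quality) if q >= min_quality]


# Hypothetical toy data: four sites, genotypes as alternate-allele counts.
imputed = [0, 1, 2, 1]
truth = [0, 1, 1, 1]
quality = [0.90, 0.95, 0.40, 0.99]  # illustrative per-site imputation quality

raw_error = error_rate(imputed, truth)
filtered_error = error_rate(
    filter_by_quality(imputed, quality, 0.8),  # 0.8 is an assumed cutoff
    filter_by_quality(truth, quality, 0.8),
)
```

In this toy case the single discordant site also has the lowest quality score, so filtering removes it. Real data will not separate this cleanly, which is why the abstract reports a trade-off between the share of sites retained and the error rate.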

SUBMITTER: Buckley RM 

PROVIDER: S-EPMC8913487 | biostudies-literature | 2022 Mar

REPOSITORIES: biostudies-literature


Publications

Best practices for analyzing imputed genotypes from low-pass sequencing in dogs.

Buckley RM, Harris AC, Wang GD, Whitaker DT, Zhang YP, Ostrander EA

Mammalian Genome: official journal of the International Mammalian Genome Society, 2021-09-08, issue 1



Similar Datasets

2018-06-08 | GSE107768 | GEO
| S-EPMC11489062 | biostudies-literature
| S-EPMC10879460 | biostudies-literature
2018-06-08 | GSE107767 | GEO
2018-06-08 | GSE107766 | GEO
| S-EPMC331400 | biostudies-literature
| S-EPMC7586657 | biostudies-literature
2024-04-30 | GSE230765 | GEO
| S-EPMC6821270 | biostudies-literature
2021-02-24 | GSE158480 | GEO