Dataset Information

Fast and accurate genotype imputation in genome-wide association studies through pre-phasing.

ABSTRACT: The 1000 Genomes Project and disease-specific sequencing efforts are producing large collections of haplotypes that can be used as reference panels for genotype imputation in genome-wide association studies (GWAS). However, imputing from large reference panels with existing methods imposes a high computational burden. We introduce a strategy called 'pre-phasing' that maintains the accuracy of leading methods while reducing computational costs. We first statistically estimate the haplotypes for each individual within the GWAS sample (pre-phasing) and then impute missing genotypes into these estimated haplotypes. This reduces the computational cost because (i) the GWAS samples must be phased only once, whereas standard methods would implicitly repeat phasing with each reference panel update, and (ii) it is much faster to match a phased GWAS haplotype to one reference haplotype than to match two unphased GWAS genotypes to a pair of reference haplotypes. We implemented our approach in the MaCH and IMPUTE2 frameworks, and we tested it on data sets from the Wellcome Trust Case Control Consortium 2 (WTCCC2), the Genetic Association Information Network (GAIN), the Women's Health Initiative (WHI) and the 1000 Genomes Project. This strategy will be particularly valuable for repeated imputation as reference panels evolve.
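Point (ii) of the abstract can be made concrete with a toy sketch. The code below is an illustration only, not the hidden Markov models used in MaCH or IMPUTE2: it assumes exhaustive, mismatch-count matching over a tiny reference panel, and every function name in it is hypothetical. It shows that matching two phased haplotypes takes a number of comparisons that grows linearly with the panel size, whereas matching one unphased genotype to a pair of reference haplotypes grows quadratically.

```python
from itertools import combinations_with_replacement


def candidates_phased(n_ref):
    """Candidates examined when each of the two phased haplotypes
    is matched to a single reference haplotype (toy count)."""
    return 2 * n_ref


def candidates_unphased(n_ref):
    """Candidates examined when an unphased genotype is matched to an
    unordered pair of reference haplotypes (toy count)."""
    return n_ref * (n_ref + 1) // 2


def best_single_match(hap, ref):
    """Index of the reference haplotype with the fewest allele
    mismatches against the phased haplotype `hap`."""
    return min(
        range(len(ref)),
        key=lambda i: sum(a != b for a, b in zip(hap, ref[i])),
    )


def best_pair_match(geno, ref):
    """Unordered pair (i, j) of reference haplotypes whose summed
    alleles differ least from the genotype dosages in `geno`."""
    def cost(pair):
        i, j = pair
        return sum(abs(g - (a + b)) for g, a, b in zip(geno, ref[i], ref[j]))

    return min(combinations_with_replacement(range(len(ref)), 2), key=cost)


# Hypothetical reference panel: 3 haplotypes, 4 biallelic markers (0/1 alleles).
ref_panel = [
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]

phased_hap = [1, 0, 0, 1]      # identical to reference haplotype 1
unphased_geno = [0, 1, 2, 1]   # dosages of haplotype 0 plus haplotype 2

single = best_single_match(phased_hap, ref_panel)
pair = best_pair_match(unphased_geno, ref_panel)
```

With a panel of 1,000 reference haplotypes, this toy count gives 2,000 single-haplotype candidates against 500,500 haplotype pairs, which is the intuition behind the speed-up the abstract describes.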

SUBMITTER: Howie B

PROVIDER: S-EPMC3696580 | biostudies-literature | 2012 Jul

REPOSITORIES: biostudies-literature

Publications

Fast and accurate genotype imputation in genome-wide association studies through pre-phasing.

Bryan Howie, Christian Fuchsberger, Matthew Stephens, Jonathan Marchini, Gonçalo R. Abecasis

Nature Genetics, 2012 Jul 22, issue 8

PMID: 22820512

Similar Datasets

EagleImp: fast and accurate genome-wide phasing and imputation in a single tool. | S-EPMC9665855 | biostudies-literature

A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. | S-EPMC2689936 | biostudies-literature

FAPI: Fast and accurate P-value Imputation for genome-wide association study. | S-EPMC4930094 | biostudies-literature

LinkImpute: Fast and Accurate Genotype Imputation for Nonmodel Organisms. | S-EPMC4632058 | biostudies-literature

A fast data-driven method for genotype imputation, phasing and local ancestry inference: MendelImpute.jl. | S-EPMC8665755 | biostudies-literature

FISH: fast and accurate diploid genotype imputation via segmental hidden Markov model. | S-EPMC4071209 | biostudies-literature

Impact of the inaccessible genome on genotype imputation and genome-wide association studies. | S-EPMC11227617 | biostudies-literature

Fast and Accurate Approximation to Significance Tests in Genome-Wide Association Studies. | S-EPMC3226809 | biostudies-literature

Simultaneous genotype calling and haplotype phasing improves genotype accuracy and reduces false-positive associations for genome-wide association studies. | S-EPMC2790566 | biostudies-literature

Improving the resolution of canine genome-wide association studies using genotype imputation: A study of two breeds. | S-EPMC8514152 | biostudies-literature