Dataset Information

High-accuracy haplotype imputation using unphased genotype data as the references.


ABSTRACT: Rapidly growing genomic datasets pose a new challenge for missing-data imputation, a notoriously resource-demanding task. Haplotype imputation requires ethnicity-matched references. However, to date, haplotype references are not available for the majority of the world's populations. We explored using existing unphased genotype datasets as references; if successful, this approach would cover almost all of the world's populations. The results showed that our HiFi software achieves 99.43% accuracy with unphased genotype references. Our method provides a cost-effective way to break through the bottleneck of limited reference availability for haplotype imputation in the big data era.

SUBMITTER: Li W 

PROVIDER: S-EPMC5373555 | biostudies-literature | 2015 Nov

REPOSITORIES: biostudies-literature

Publications

High-accuracy haplotype imputation using unphased genotype data as the references.

Li Wenzhi, Xu Wei, Fu Guoxing, Ma Li, Richards Jendai, Rao Weinian, Bythwood Tameka, Guo Shiwen, Song Qing

Gene, 2015-07-30, issue 2


Similar Datasets

| S-EPMC4995474 | biostudies-literature
| S-EPMC4888899 | biostudies-literature
| S-EPMC2835450 | biostudies-literature
| S-EPMC6322752 | biostudies-literature
| S-EPMC7520044 | biostudies-literature
| S-EPMC6055857 | biostudies-other
| S-EPMC2553438 | biostudies-literature