Dataset Information

Comparison of multiple imputation algorithms and verification using whole-genome sequencing in the CMUH genetic biobank.


ABSTRACT: A genome-wide association study (GWAS) can be conducted to systematically analyze the contributions of genetic factors to a wide variety of complex diseases. Nevertheless, existing GWASs have provided highly ethnicity-specific data. Accordingly, to provide data specific to Taiwan, we established a large-scale genetic database in a single medical institution, the China Medical University Hospital. Owing to current technological limitations, microarray analysis can detect only a limited number of single-nucleotide polymorphisms (SNPs) with a minor allele frequency of >1%. Imputation, however, represents a useful alternative means of expanding the data. In this study, we compared four imputation algorithms in terms of various metrics. Among the compared algorithms, Beagle5.2 achieved the fastest calculation speed, the smallest storage footprint, the highest specificity, and the highest number of high-quality variants. Using Beagle5.2, we obtained 15,277,414 high-quality variants in 175,871 people. In our internal verification process, Beagle5.2 exhibited an accuracy rate of up to 98.75%. We also conducted external verification, in which our imputed variants had a 79.91% mapping rate and 90.41% accuracy. These results will be combined with clinical data in future research. We have made the results available for researchers to use in formulating imputation algorithms, in addition to establishing a complete SNP database for GWAS and polygenic risk score (PRS) researchers. We believe that these data can help improve overall medical capabilities, particularly precision medicine, in Taiwan.

SUBMITTER: Liu TY 

PROVIDER: S-EPMC8823485 | biostudies-literature | 2021

REPOSITORIES: biostudies-literature

Publications

Comparison of multiple imputation algorithms and verification using whole-genome sequencing in the CMUH genetic biobank.

Liu Ting-Yuan, Lin Chih-Fan, Wu Hsing-Tsung, Wu Ya-Lun, Chen Yu-Chia, Liao Chi-Chou, Chou Yu-Pao, Chao Dysan, Chang Ya-Sian, Lu Hsing-Fang, Chang Jan-Gowth, Hsu Kai-Cheng, Tsai Fuu-Jen

BioMedicine 2021-12-01 4


Similar Datasets

| S-EPMC3973287 | biostudies-literature
| S-EPMC5473732 | biostudies-literature
| S-EPMC8762119 | biostudies-literature
| S-EPMC5458552 | biostudies-literature
| S-EPMC7125075 | biostudies-literature
| S-EPMC8493217 | biostudies-literature
| S-EPMC4076012 | biostudies-literature
| S-EPMC6547561 | biostudies-literature
2022-02-27 | GSE160594 | GEO
| S-EPMC11373650 | biostudies-literature