Dataset Information

Genomic analyses of 10,376 individuals in the Westlake BioBank for Chinese (WBBC) pilot project.


ABSTRACT: We initiate the Westlake BioBank for Chinese (WBBC) pilot project with 4,535 individuals profiled by whole-genome sequencing (WGS) and 5,841 individuals profiled by high-density genotyping, and identify 81.5 million SNPs and INDELs, of which 38.5% are absent from dbSNP Build 151. We provide a population-specific reference panel and an online imputation server (https://wbbc.westlake.edu.cn/) that could substantially improve imputation performance in the Chinese population, especially for low-frequency and rare variants. By analyzing the singleton density of the WGS data, we find selection signatures in the SNX29, DNAH1 and WDR1 genes, and we find that the derived alleles of the alcohol metabolism genes (ADH1A and ADH1B) emerged around 7,000 years ago and have tended to become more common in East Asia since 4,000 years ago. Genetic evidence supports the Qinling-Huaihe Line and the Nanling Mountains as geographical boundaries that separate the Han Chinese into subgroups, and we reveal that North Han is more homogeneous than South Han.

SUBMITTER: Cong PK 

PROVIDER: S-EPMC9135724 | biostudies-literature | 2022 May

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC8240579 | biostudies-literature
| S-EPMC6444358 | biostudies-literature
| PRJNA13681 | ENA
| S-EPMC6483073 | biostudies-literature
| S-EPMC6078601 | biostudies-literature
2011-04-29 | GSE28920 | GEO
| S-EPMC11584710 | biostudies-literature
2011-04-29 | E-GEOD-28920 | biostudies-arrayexpress
2016-06-22 | E-PROT-5 | biostudies-arrayexpress
| S-EPMC7365470 | biostudies-literature