Dataset Information


Next-generation sequencing reveals new information about HLA allele and haplotype diversity in a large European American population.

ABSTRACT: The human leukocyte antigen (HLA) genes are extremely polymorphic and are useful molecular markers for making inferences about human population history. However, the accuracy of genetic diversity estimates at HLA loci depends strongly on the technology used to characterize HLA alleles; high-resolution genotyping of long-range HLA gene products improves the assessment of HLA population diversity, as well as other population parameters, compared to lower-resolution typing methods. In this study we examined allelic and haplotype HLA diversity in a large healthy European American population sourced from the UCSF-DNA bank. A high-resolution next-generation sequencing method was applied to define non-ambiguous 3- and 4-field alleles at the HLA-A, HLA-C, HLA-B, HLA-DRB1, HLA-DRB3/4/5, HLA-DQA1, HLA-DQB1, HLA-DPA1, and HLA-DPB1 loci in samples provided by 2248 unrelated individuals. A number of population parameters were examined, including balancing selection, and various measures of linkage disequilibrium were calculated. There were no detectable deviations from Hardy-Weinberg proportions at HLA-A, HLA-DRB1, HLA-DQA1, and HLA-DQB1. At the remaining loci (HLA-C, HLA-B, HLA-DRB3/4/5, HLA-DPA1, and HLA-DPB1), moderate and significant deviations were detected, mostly arising from population substructure. Unique 4-field associations were observed among alleles at 2 loci and in haplotypes extending over large intervals; these were not apparent in results obtained using testing methodologies with limited sequence coverage and phasing. The high diversity at HLA-DPA1 results from the detection of intron variants of otherwise well-conserved protein sequences. It may be speculated that divergence in exon sequences is negatively selected. Our data provide a valuable reference source for future population studies that may allow for precise fine mapping of coding and non-coding sequences determining disease susceptibility and allo-immunogenicity.


PROVIDER: S-EPMC6778037 | BioStudies | 2019-01-01

REPOSITORIES: biostudies

Similar Datasets

2018-01-01 | S-EPMC7011615 | BioStudies
2010-01-01 | S-EPMC2910726 | BioStudies
| S-EPMC7276866 | BioStudies
2019-01-01 | S-EPMC6774514 | BioStudies
2019-01-01 | S-EPMC6599556 | BioStudies
2013-01-01 | S-EPMC3634360 | BioStudies
2017-01-01 | S-EPMC5519768 | BioStudies
2020-01-01 | S-EPMC7272581 | BioStudies
2017-01-01 | S-EPMC5593890 | BioStudies
2018-01-01 | S-EPMC5788571 | BioStudies