Self-reported race/ethnicity in the age of genomic research: its potential impact on understanding health disparities.
ABSTRACT: This review explores the limitations of self-reported race, ethnicity, and genetic ancestry in biomedical research. Various terminologies are used to classify human differences in genomic research, including race, ethnicity, and ancestry. Although race and ethnicity are related, race refers to a person's physical appearance, such as skin color and eye color. Ethnicity, on the other hand, refers to communality in cultural heritage, language, social practice, traditions, and geopolitical factors. Genetic ancestry inferred using ancestry informative markers (AIMs) is based on genetic/genomic data. Phenotype-based race/ethnicity information and data computed using AIMs often disagree. For example, self-reporting African Americans can have drastically different levels of African or European ancestry. Genetic analysis of individual ancestry shows that some self-identified African Americans have up to 99% European ancestry, whereas some self-identified European Americans have substantial admixture from African ancestry. Similarly, African ancestry in the Latino population ranges from 3% in Mexican Americans to 16% in Puerto Ricans. The implication is that, in African American or Latino populations, self-reported ancestry may not be as accurate as direct assessment of individual genomic information in predicting treatment outcomes. To better understand human genetic variation in the context of health disparities, we suggest using "ancestry" (or biogeographical ancestry) to describe actual genetic variation, "race" to describe health disparity in societies characterized by racial categories, and "ethnicity" to describe traditions, lifestyle, diet, and values. We also suggest using ancestry informative markers for precise characterization of individuals' biological ancestry. Understanding the sources of human genetic variation and the causes of health disparities could lead to interventions that would improve the health of all individuals.
Project description: PURPOSE: The Vanderbilt DNA Databank (BioVU) is a biorepository that currently contains >80,000 DNA samples linked to electronic medical records. Although BioVU is a valuable source of samples and phenotypes for genetic association studies, it is unclear whether the administratively assigned race/ethnicity in BioVU can accurately describe and be used as a proxy for genetic ancestry. METHODS: We genotyped 360 single nucleotide polymorphisms on the Illumina DNA Test Panel containing ancestry informative markers in 1910 BioVU samples with observer-reported ancestry and 384 samples from the Multiple Sclerosis Genetics Group with self-reported ancestry. Genetic ancestry was inferred for all individuals using Structure 2.2. RESULTS: More than 98% of observer-reported European Americans were genetically inferred to have at least 60% European ancestry. Ninety-three percent of observer-reported African Americans were genetically inferred to be predominantly of African ancestry. We determined that the concordance of observer-reported race/ethnicity and inferred genetic ancestry was not significantly different from that of self-reported race/ethnicity in either population (P = 0.09 and 0.94 in European Americans and African Americans, respectively). CONCLUSIONS: Observer-reported race/ethnicity for European Americans and African Americans approximates genetic ancestry as well as self-reported race/ethnicity, making biorepositories linked to electronic medical records such as BioVU a viable source of DNA samples for future large-scale genetic association studies.
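The concordance figure reported in the abstract above (the share of a reported group whose inferred ancestry meets a threshold, such as at least 60% European ancestry) can be sketched in a few lines of Python. This is only an illustration: the function name `concordance` and the example fractions are hypothetical, and the abstract does not describe how the authors computed or compared their concordance values.

```python
def concordance(inferred_fractions, threshold):
    """Share of individuals whose inferred ancestry fraction is >= threshold.

    inferred_fractions: per-individual ancestry proportions in [0, 1], e.g. the
    European-ancestry fraction inferred for each person reported as European
    American. threshold: cutoff that counts as agreement (0.60 in the abstract
    for European Americans).
    """
    if not inferred_fractions:
        raise ValueError("need at least one individual")
    hits = sum(1 for fraction in inferred_fractions if fraction >= threshold)
    return hits / len(inferred_fractions)


# Hypothetical inferred European-ancestry fractions, not data from the study.
observer_reported_ea = [0.97, 0.88, 0.62, 0.55, 0.99]
print(concordance(observer_reported_ea, 0.60))
```

Keeping the threshold as a parameter matters here because the abstract gives an explicit cutoff only for European Americans; "predominantly" African ancestry is not quantified in the text.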
Project description: Family history and African-American race are important risk factors for both prostate cancer (CaP) incidence and aggressiveness. When studying complex diseases such as CaP that have a heritable component, chances of finding true disease susceptibility alleles can be increased by accounting for genetic ancestry within the population investigated. Race, ethnicity and ancestry were studied in a geographically diverse cohort of men with newly diagnosed CaP. Individual ancestry (IA) was estimated in the population-based North Carolina and Louisiana Prostate Cancer Project (PCaP), a cohort of 2,106 incident CaP cases (2,063 with complete ethnicity information) comprising roughly equal numbers of research subjects reporting as Black/African American (AA) or European American/Caucasian/Caucasian American/White (EA) from North Carolina or Louisiana. Mean genome-wide individual ancestry estimates of percent African, European and Asian were obtained and tested for differences by state and ethnicity (Cajun and/or Creole and Hispanic/Latino) using multivariate analysis of variance models. Principal components (PC) were compared to assess differences in genetic composition by self-reported race and ethnicity between and within states. Mean individual ancestries differed by state for self-reporting AA (p = 0.03) and EA (p = 0.001). This geographic difference attenuated for AAs who answered "no" to all ethnicity membership questions (non-ethnic research subjects; p = 0.78) but not for EA research subjects (p = 0.002). Mean ancestry estimates of self-identified AA Louisiana research subjects in each ethnic group (Cajun only, Creole only, and both Cajun and Creole) differed significantly from those of self-identified non-ethnic AA Louisiana research subjects. These ethnicity differences were not seen in those who self-identified as EA. Mean IA differed by race between states, elucidating a potential contributing factor to these differences in AA research participants: self-reported ethnicity. Accurately accounting for genetic admixture in this cohort is essential for future analyses of the genetic and environmental contributions to CaP.
Project description: Self-reported ancestry, genetically determined ancestry, and APOL1 polymorphisms are associated with variation in kidney function and related disease risk, but the relative importance of these factors remains unclear. We estimated the global proportion of African ancestry for 9048 individuals at Mount Sinai Medical Center in Manhattan (3189 African Americans, 1721 European Americans, and 4138 Hispanic/Latino Americans by self-report) using genome-wide genotype data. CKD-EPI eGFR and genotypes of three APOL1 coding variants were available. In admixed African Americans and Hispanic/Latino Americans, serum creatinine values increased as African ancestry increased (per 10% increase in African ancestry, creatinine values increased 1% in African Americans and 0.9% in Hispanic/Latino Americans; P≤1×10⁻⁷). eGFR was likewise significantly associated with African genetic ancestry in both populations. In contrast, APOL1 risk haplotypes were significantly associated with CKD, eGFR<45 ml/min per 1.73 m², and ESRD, with effects increasing with worsening disease states and the contribution of genetic African ancestry decreasing in parallel. Using genetic ancestry in the eGFR equation to reclassify patients as black on the basis of ≥50% African ancestry resulted in higher eGFR for 14.7% of Hispanic/Latino Americans and lower eGFR for 4.1% of African Americans, affecting CKD staging in 4.3% and 1% of participants, respectively. Reclassified individuals had electrolyte values consistent with their newly assigned CKD stage. In summary, proportion of African ancestry was significantly associated with normal-range creatinine and eGFR, whereas APOL1 risk haplotypes drove the associations with CKD. Recalculation of eGFR on the basis of genetic ancestry affected CKD staging and warrants additional investigation.
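To make the reclassification step in the abstract above concrete, the following Python sketch shows how switching the race term in an eGFR equation can move a patient across a CKD stage boundary. It assumes the 2009 CKD-EPI creatinine equation (the abstract says only "CKD-EPI eGFR") and standard GFR category cutoffs; the function names, the 0.5 ancestry cutoff parameter, and the example patient are illustrative assumptions, not the authors' code or data.

```python
def ckd_epi_2009(scr_mg_dl, age, female, black):
    """eGFR (ml/min per 1.73 m^2) from the 2009 CKD-EPI creatinine equation."""
    kappa = 0.7 if female else 0.9          # sex-specific creatinine scaling
    alpha = -0.329 if female else -0.411    # exponent used below kappa
    ratio = scr_mg_dl / kappa
    egfr = (141.0 * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159                       # race coefficient of the 2009 equation
    return egfr


def ckd_stage(egfr):
    """GFR category (G1 to G5) from eGFR, using the standard cutoffs."""
    for lower_bound, label in [(90, "G1"), (60, "G2"), (45, "G3a"),
                               (30, "G3b"), (15, "G4")]:
        if egfr >= lower_bound:
            return label
    return "G5"


def egfr_by_ancestry(scr_mg_dl, age, female, african_fraction, cutoff=0.5):
    """Apply the race coefficient when genetic African ancestry is >= cutoff."""
    return ckd_epi_2009(scr_mg_dl, age, female, black=african_fraction >= cutoff)


# Hypothetical patient: male, 60 years, serum creatinine 1.4 mg/dl.
for fraction in (0.30, 0.65):
    value = egfr_by_ancestry(1.4, 60, False, fraction)
    print(fraction, round(value, 1), ckd_stage(value))
```

Because the race term is a fixed multiplier, reclassification changes eGFR by the same proportion for every patient; whether staging changes depends only on how close the original value sits to a category boundary.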
Project description: BACKGROUND: Cigarette smoking is the major cause of chronic obstructive pulmonary disease and emphysema. Recent studies suggest that susceptibility to cigarette smoke may vary by race/ethnicity; however, they were generally small and relied on self-reported race/ethnicity. OBJECTIVE: To test the hypothesis that relationships of smoking to lung function and per cent emphysema differ by genetic ancestry and self-reported race/ethnicity among Caucasians, African-Americans, Hispanics and Chinese-Americans. DESIGN: Cross-sectional population-based study of adults age 45-84 years in the USA. MEASUREMENTS: Principal components of genetic ancestry and continental ancestry estimated from one million genome-wide single nucleotide polymorphisms; pack-years of smoking; spirometry measured for 3344 participants; and per cent emphysema on computed tomography for 8224 participants. RESULTS: The prevalence of ever-smoking was: Caucasians, 57.6%; African-Americans, 56.4%; Hispanics, 46.7%; and Chinese-Americans, 26.8%. Every 10 pack-years was associated with -0.73% (95% CI -0.90% to -0.56%) decrement in the forced expiratory volume in 1 s to forced vital capacity (FEV1 to FVC) and a 0.23% (95% CI 0.08% to 0.38%) increase in per cent emphysema. There was no evidence that relationships of pack-years to the FEV1 to FVC, airflow obstruction and per cent emphysema varied by genetic ancestry (all p>0.10), self-reported race/ethnicity (all p>0.10) or, among African-Americans, African ancestry. There were small differences in relationships of pack-years to the FEV1 among male Chinese-Americans and to the FEV1 to FVC ratio with African and Native American ancestry among male Hispanics only. CONCLUSIONS: In this large cohort, there was little to no evidence that the associations of smoking to lung function and per cent emphysema differed by genetic ancestry or self-reported race/ethnicity.
Project description: We examined the relationship between genetic ancestry, socioeconomic status (SES), and lung cancer among African Americans and Latinos. We evaluated SES and genetic ancestry in a Northern California lung cancer case-control study (1998-2003) of African Americans and Latinos. Lung cancer case and control participants were frequency matched on age, gender, and race/ethnicity. We assessed case-control differences in individual admixture proportions using the 2-sample t test and analysis of covariance. Logistic regression models examined associations among genetic ancestry, socioeconomic characteristics, and lung cancer. Decreased Amerindian ancestry was associated with higher education among Latino control participants, and greater African ancestry was associated with decreased education among African American lung cancer case participants. Education was associated with lung cancer among both Latinos and African Americans, independent of smoking, ancestry, age, and gender. Genetic ancestry was not associated with lung cancer among African Americans. Findings suggest that socioeconomic factors may have a greater impact than genetic ancestry on lung cancer among African Americans. The genetic heterogeneity and recent dynamic migration and acculturation of Latinos complicate recruitment; thus, epidemiological analyses and findings should be interpreted cautiously.
Project description: BACKGROUND: Epidemiologic studies report that self-identified African Americans typically have higher hemostatic factor levels than do self-identified Caucasians or Hispanics. OBJECTIVE: To enhance understanding of phenotypic variation in hemostatic factor levels by race/ethnicity, we evaluated the relationship between genetic ancestry and hemostatic factor levels among Multi-Ethnic Study of Atherosclerosis (MESA) study participants. PATIENTS/METHODS: Our sample included 712 African American and 701 Hispanic men and women aged 45 to 84 years. Individual global ancestry was estimated from 199 genetic markers using STRUCTURE. Linear regression models were used to evaluate the relationship between ancestry and hemostatic factor levels, adjusting for age, gender, education, income and study site. RESULTS: Among African Americans, mean ± standard deviation (SD) ancestry was estimated as 79.9% ± 15.9% African and 20.1% ± 15.9% European. Each SD (16%) greater African ancestry was associated with 2.1% higher fibrinogen levels (P = 0.007) and 3.5% higher plasmin-antiplasmin (PAP) levels (P = 0.02). Ancestry among African Americans was not related to levels of factor (F)VIII or D-dimer. Mean ± SD estimated ancestry among Hispanics was 48.3% ± 23.8% Native American, 38.8% ± 21.9% European, and 13.0% ± 8.9% African. In Hispanics, each SD (19%) greater African ancestry was associated with 2.7% higher fibrinogen levels (P = 0.009) and 7.9% higher FVIII levels (P = 0.0002). In Hispanics, there was no relation between African ancestry and D-dimer or PAP levels, or between European ancestry and hemostatic factor levels. CONCLUSIONS: Greater African ancestry among African Americans and Hispanics was associated with higher levels of several hemostatic factors, notably fibrinogen. These results suggest that genetic heterogeneity contributes, albeit modestly, to racial/ethnic differences in hemostatic factor levels.
Project description: Lower serum vitamin D (25(OH)D) among individuals with African ancestry is attributed primarily to skin pigmentation. However, the influence of genetic polymorphisms controlling for skin melanin content has not been investigated. Therefore, we investigated differences in non-summer serum vitamin D metabolites according to self-reported race, genetic ancestry, skin reflectance and key pigmentation genes (SLC45A2 and SLC24A5). Healthy individuals reporting at least half African American or half European American heritage were frequency matched to one another on age (±2 years) and sex. 176 autosomal ancestry informative markers were used to estimate genetic ancestry. Melanin index was measured by reflectance spectrometry. Serum vitamin D metabolites (25(OH)D3, 25(OH)D2 and 24,25(OH)2D3) were determined by high performance liquid chromatography (HPLC) tandem mass spectrometry. Percent 24,25(OH)2D3 was calculated as a percent of the parent metabolite (25(OH)D3). Stepwise and backward selection regression models were used to identify leading covariates. Fifty African Americans and 50 European Americans participated in the study. Compared with SLC24A5 111(Thr) homozygotes, individuals with the SLC24A5 111(Thr/Ala) and 111(Ala/Ala) genotypes had respectively lower levels of 25(OH)D3 (23.0 and 23.8 nmol/L lower, p-dominant=0.007), and percent 24,25(OH)2D3 (4.1 and 5.2 percent lower, p-dominant=0.003), controlling for tanning bed use, vitamin D/fish oil supplement intake, race/ethnicity, and genetic ancestry. Results were similar with melanin index adjustment, and were not confounded by glucocorticoid, oral contraceptive, or statin use. The SLC24A5 111(Ala) allele was associated with lower serum vitamin 25(OH)D3 and lower percent 24,25(OH)2D3, independently from melanin index and West African genetic ancestry.
Project description: Genetic association studies can be used to identify factors that may contribute to disparities in disease evident across different racial and ethnic populations. However, such studies may not account for potential confounding if study populations are genetically heterogeneous. Racial and ethnic classifications have been used as proxies for genetic relatedness. We investigated genetic admixture and developed a questionnaire to explore variables used in constructing racial identity in two cohorts: 50 African Americans and 40 Nigerians. Genetic ancestry was determined by genotyping 107 ancestry informative markers. Ancestry estimates calculated with maximum likelihood estimation were compared with population stratification detected with principal components analysis. Ancestry was approximately 95% west African, 4% European, and 1% Native American in the Nigerian cohort and 83% west African, 15% European, and 2% Native American in the African American cohort. Therefore, self-identification as African American agreed well with inferred west African ancestry. However, the cohorts differed significantly in mean percentage west African and European ancestries (P < 0.0001) and in the variance for individual ancestry (P ≤ 0.01). Among African Americans, no set of questionnaire items effectively estimated degree of west African ancestry, and self-report of a high degree of African ancestry in a three-generation family tree did not accurately predict degree of African ancestry. Our findings suggest that self-reported race and ancestry can predict ancestral clusters but do not reveal the extent of admixture. Genetic classifications of ancestry may provide a more objective and accurate method of defining homogeneous populations for the investigation of specific population-disease associations.
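As a rough illustration of how maximum likelihood estimation can turn ancestry informative marker genotypes into an individual ancestry estimate, here is a minimal Python sketch. It is deliberately simplified and is not the method or software used in the study above: it assumes only two parental populations (the study reports three ancestry components), independent biallelic markers, known parental allele frequencies, and a grid search in place of a proper optimizer. All names and numbers are hypothetical.

```python
import math


def log_likelihood(q, genotypes, freq_a, freq_b):
    """Log-likelihood of ancestry fraction q from population A.

    genotypes: allele counts (0, 1 or 2) of the reference allele per marker.
    freq_a, freq_b: reference-allele frequencies in parental populations A and B.
    Each allele is treated as an independent draw from the mixed frequency
    q * freq_a + (1 - q) * freq_b (binomial constants are dropped).
    """
    total = 0.0
    for count, pa, pb in zip(genotypes, freq_a, freq_b):
        p = q * pa + (1.0 - q) * pb
        p = min(max(p, 1e-9), 1.0 - 1e-9)   # guard against log(0)
        total += count * math.log(p) + (2 - count) * math.log(1.0 - p)
    return total


def estimate_ancestry(genotypes, freq_a, freq_b, steps=1000):
    """Grid-search maximum likelihood estimate of q on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: log_likelihood(q, genotypes, freq_a, freq_b))


# Hypothetical markers with large frequency differences between populations.
freq_a = [0.90, 0.85, 0.10, 0.95, 0.20]
freq_b = [0.10, 0.20, 0.80, 0.15, 0.90]
genotypes = [2, 1, 0, 2, 1]
print(estimate_ancestry(genotypes, freq_a, freq_b))
```

The sketch also shows why markers with large allele frequency differences between parental populations are informative: the likelihood changes steeply with q only when `freq_a` and `freq_b` are far apart.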
Project description: Prior studies of lung cancer and CYP1A1/2 in African-American and Latino populations have shown inconsistent results and have not yet investigated the haplotype block structure of CYP1A1/2 or addressed potential population stratification. To investigate haplotypes in the CYP1A1/2 region and lung cancer in African-Americans and Latinos, we conducted a case-control study (1998-2003). African-Americans (n = 535) and Latinos (n = 412) were frequency matched on age, sex, and self-reported race/ethnicity. We used a custom genotyping panel containing 50 single nucleotide polymorphisms in the CYP1A1/2 region and 184 ancestry informative markers selected to have large allele frequency differences between Africans, Europeans, and Amerindians. Latinos exhibited significant haplotype main effects in two blocks even after adjusting for admixture [odds ratio (OR), 2.02; 95% confidence interval (95% CI), 1.28-3.19 and OR, 0.55; 95% CI, 0.36-0.83], but no main effects were found among African-Americans. Adjustment for admixture revealed substantial confounding by population stratification among Latinos but not African-Americans. Among Latinos and African-Americans, interactions between smoking level and haplotypes were not statistically significant. Evidence of population stratification among Latinos underscores the importance of adjusting for admixture in lung cancer association studies, particularly in Latino populations. These results suggest that a variant occurring within the CYP1A2 region may be conferring an increased risk of lung cancer in Latinos.
Project description: BACKGROUND: Differences in cardiovascular disease (CVD) burden exist among racial/ethnic groups in the United States, with African-Americans having the highest prevalence. Subclinical CVD measures have also been shown to differ by race or ethnicity. In the United States, there has been a significant intermixing among racial/ethnic groups creating admixed populations. Very little research exists on the relationship of genetic ancestry and subclinical CVD measures. METHODS AND RESULTS: These associations were investigated in 712 black and 705 Hispanic participants from the Multi-Ethnic Study of Atherosclerosis candidate gene substudy. Individual ancestry was estimated from 199 genetic markers using STRUCTURE. Associations of ancestry and coronary artery calcium (CAC) and common and internal carotid intima media thickness were evaluated using log-binomial and linear regression models. Splines indicated linear associations of ancestry with subclinical CVD measures in African-Americans but presence of threshold effects in Hispanics. Among African-Americans, each SD increase in European ancestry was associated with an 8% (95% CI, 1.02 to 1.15; P=0.01) higher CAC prevalence. Each SD increase in European ancestry was also associated with a 2% (95% CI -3.4% to -0.5%, P=0.008) lower common carotid intima media thickness in African-Americans. Among Hispanics, the highest tertile of European ancestry was associated with a 34% higher CAC prevalence (P=0.02) when compared with the lowest tertile. CONCLUSIONS: The linear association of ancestry and subclinical CVD suggests that genetic effects may be important in determining CAC and carotid intima media thickness among African-Americans. Our results also suggest that CAC and common carotid intima media thickness may be important phenotypes for further study with admixture mapping.