Phenome-wide association study (PheWAS) for detection of pleiotropy within the Population Architecture using Genomics and Epidemiology (PAGE) Network.
ABSTRACT: Using a phenome-wide association study (PheWAS) approach, we comprehensively tested genetic variants for association with phenotypes available for 70,061 study participants in the Population Architecture using Genomics and Epidemiology (PAGE) network. Our aim was to better characterize the genetic architecture of complex traits and identify novel pleiotropic relationships. This PheWAS drew on five population-based studies representing four major racial/ethnic groups (European Americans (EA), African Americans (AA), Hispanics/Mexican-Americans, and Asian/Pacific Islanders) in PAGE, each site with measurements for multiple traits, associated laboratory measures, and intermediate biomarkers. A total of 83 single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) were genotyped across two or more PAGE study sites. Comprehensive tests of association, stratified by race/ethnicity, were performed, encompassing 4,706 phenotypes mapped to 105 phenotype-classes, and association results were compared across study sites. A total of 111 PheWAS results were significant at p<0.01 in two or more PAGE study sites, with a consistent direction of effect, for the same racial/ethnic group, SNP, and phenotype-class. Among results identified for SNPs previously associated with phenotypes such as lipid traits, type 2 diabetes, and body mass index, 52 replicated previously published genotype-phenotype associations, 26 represented phenotypes closely related to previously known genotype-phenotype associations, and 33 represented potentially novel genotype-phenotype associations with pleiotropic effects. The majority of the potentially novel results were for single PheWAS phenotype-classes; for example, for CDKN2A/B rs1333049 (previously associated with type 2 diabetes in EA), a PheWAS association was identified for hemoglobin levels in AA. 
Of note, however, GALNT2 rs2144300 (previously associated with high-density lipoprotein cholesterol levels in EA) had multiple potentially novel PheWAS associations: with hypertension-related phenotypes in AA, and with serum calcium levels and coronary artery disease phenotypes in EA. PheWAS identifies associations for hypothesis generation and exploration of the genetic architecture of complex traits.
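The cross-site replication criterion described in this abstract (p<0.01 in two or more study sites, consistent direction of effect, same racial/ethnic group, SNP, and phenotype-class) can be illustrated with a minimal Python sketch. The record layout, the field names, and the toy values below are assumptions made for illustration; they are not taken from the PAGE data or pipeline.

```python
from collections import defaultdict

P_THRESHOLD = 0.01  # significance threshold stated in the abstract


def replicated_associations(results, p_threshold=P_THRESHOLD, min_sites=2):
    """Return (group, snp, phenotype_class) keys that are significant in at
    least `min_sites` sites with the same direction of effect."""
    # key -> direction (+1 or -1) -> set of sites supporting that direction
    sites_by_key = defaultdict(lambda: defaultdict(set))
    for r in results:
        if r["p"] < p_threshold and r["beta"] != 0:
            key = (r["group"], r["snp"], r["phenotype_class"])
            direction = 1 if r["beta"] > 0 else -1
            sites_by_key[key][direction].add(r["site"])
    return sorted(
        key
        for key, by_direction in sites_by_key.items()
        if any(len(sites) >= min_sites for sites in by_direction.values())
    )


# Hypothetical per-site results; every value here is made up.
toy_results = [
    {"site": "A", "group": "AA", "snp": "snp1", "phenotype_class": "hemoglobin", "beta": 0.20, "p": 0.004},
    {"site": "B", "group": "AA", "snp": "snp1", "phenotype_class": "hemoglobin", "beta": 0.15, "p": 0.008},
    # Significant in two sites, but with opposite directions: not replicated.
    {"site": "A", "group": "EA", "snp": "snp2", "phenotype_class": "calcium", "beta": 0.10, "p": 0.003},
    {"site": "B", "group": "EA", "snp": "snp2", "phenotype_class": "calcium", "beta": -0.12, "p": 0.002},
    # Same direction, but one site misses the threshold: not replicated.
    {"site": "A", "group": "EA", "snp": "snp3", "phenotype_class": "lipids", "beta": 0.30, "p": 0.001},
    {"site": "B", "group": "EA", "snp": "snp3", "phenotype_class": "lipids", "beta": 0.25, "p": 0.040},
]

replicated = replicated_associations(toy_results)
print(replicated)
```

Grouping by direction before counting sites keeps two significant but opposite-signed results from being counted as a replication.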
Project description:We performed a hypothesis-generating phenome-wide association study (PheWAS) to identify and characterize cross-phenotype associations, where one SNP is associated with two or more phenotypes, between thousands of genetic variants assayed on the Metabochip and hundreds of phenotypes in 5,897 African Americans as part of the Population Architecture using Genomics and Epidemiology (PAGE) I study. The PAGE I study was a National Human Genome Research Institute-funded collaboration of four study sites accessing diverse epidemiologic studies genotyped on the Metabochip, a custom genotyping chip that has dense coverage of regions in the genome previously associated with cardio-metabolic traits and outcomes in mostly European-descent populations. Here we focus on identifying novel phenome-genome relationships, where SNPs are associated with more than one phenotype. To do this, we performed a PheWAS, testing each SNP on the Metabochip for an association with up to 273 phenotypes in the participating PAGE I study sites. We identified 133 putative pleiotropic variants, defined as SNPs associated at an empirically derived p-value threshold of p<0.01 in two or more PAGE study sites for two or more phenotype classes. We further annotated these PheWAS-identified variants using publicly available functional data and local genetic ancestry. Amongst our novel findings is SPARC rs4958487, associated with increased glucose levels and hypertension. SPARC has been implicated in the pathogenesis of diabetes and is also known to have a potential role in fibrosis, a common consequence of multiple conditions including hypertension. The SPARC example and others highlight the potential that PheWAS approaches have in improving our understanding of complex disease architecture by identifying novel relationships between genetic variants and an array of common human phenotypes.
Project description:We performed a Phenome-wide association study (PheWAS) utilizing diverse genotypic and phenotypic data existing across multiple populations in the National Health and Nutrition Examination Surveys (NHANES), conducted by the Centers for Disease Control and Prevention (CDC), and accessed by the Epidemiological Architecture for Genes Linked to Environment (EAGLE) study. We calculated comprehensive tests of association in Genetic NHANES using 80 SNPs and 1,008 phenotypes (grouped into 184 phenotype classes), stratified by race-ethnicity. Genetic NHANES includes three surveys (NHANES III, 1999-2000, and 2001-2002) and three race-ethnicities: non-Hispanic whites (n = 6,634), non-Hispanic blacks (n = 3,458), and Mexican Americans (n = 3,950). We identified 69 PheWAS associations replicating across surveys for the same SNP, phenotype-class, direction of effect, and race-ethnicity at p<0.01, allele frequency >0.01, and sample size >200. Of these 69 PheWAS associations, 39 replicated previously reported SNP-phenotype associations, 9 were related to previously reported associations, and 21 were novel associations. Fourteen results had the same direction of effect across more than one race-ethnicity: one result was novel, 11 replicated previously reported associations, and two were related to previously reported results. Thirteen SNPs showed evidence of pleiotropy. We further explored results with gene-based biological networks, contrasting the direction of effect for pleiotropic associations across phenotypes. One PheWAS result was ABCG2 missense SNP rs2231142, associated with uric acid levels in both non-Hispanic whites and Mexican Americans, protoporphyrin levels in non-Hispanic whites and Mexican Americans, and blood pressure levels in Mexican Americans. 
Another example was SNP rs1800588 near LIPC, significantly associated with the novel phenotypes of folate levels (Mexican Americans), vitamin E levels (non-Hispanic whites) and triglyceride levels (non-Hispanic whites), and replication for cholesterol levels. The results of this PheWAS show the utility of this approach for exposing more of the complex genetic architecture underlying multiple traits, through generating novel hypotheses for future research.
Project description:We conducted an electronic health record (EHR)-based phenome-wide association study (PheWAS) to discover pleiotropic effects of variants in three lipoprotein metabolism genes: PCSK9, APOB, and LDLR. Using high-density genotype data, we tested the associations of variants in the three genes with 1232 EHR-derived binary phecodes in 51,700 European-ancestry (EA) individuals and 585 phecodes in 10,276 African-ancestry (AA) individuals; 457 PCSK9, 730 APOB, and 720 LDLR variants were filtered by imputation quality (r² > 0.4), minor allele frequency (>1%), linkage disequilibrium (r² < 0.3), and association with LDL-C levels, yielding a set of two PCSK9, three APOB, and five LDLR variants in EA but no variants in AA. Cases and controls were defined for each phecode using the PheWAS package in R. Logistic regression assuming an additive genetic model was used with adjustment for age, sex, and the first two principal components. Significant associations were tested in additional cohorts from Vanderbilt University (n = 29,713), the Marshfield Clinic Personalized Medicine Research Project (n = 9562), and UK Biobank (n = 408,455). We identified one PCSK9, two APOB, and two LDLR variants significantly associated with an examined phecode. Only one of the variants was associated with a non-lipid disease phecode ("myopia"), but this association was not significant in the replication cohorts. In this large-scale PheWAS we did not find LDL-C-related variants in PCSK9, APOB, and LDLR to be associated with non-lipid-related phenotypes including diabetes, neurocognitive disorders, or cataracts.
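The variant-filtering step in this abstract (imputation quality r² > 0.4, minor allele frequency > 1%, pairwise linkage disequilibrium r² < 0.3, and association with LDL-C) might look roughly like the Python sketch below. The thresholds come from the abstract. Everything else is an assumption for illustration: the field names, the boolean `ldl_associated` flag (the abstract gives no LDL-C significance cutoff), the greedy first-come pruning order, and the toy values. It is not the authors' pipeline.

```python
def pair_r2(ld_r2, a, b):
    """Look up pairwise LD r² for two variant IDs; unlisted pairs count as 0."""
    return ld_r2.get(frozenset((a, b)), 0.0)


def filter_variants(variants, ld_r2, min_info=0.4, min_maf=0.01, max_ld_r2=0.3):
    """Apply per-variant filters, then greedily prune correlated variants."""
    passing = [
        v for v in variants
        if v["info"] > min_info        # imputation quality r² > 0.4
        and v["maf"] > min_maf         # minor allele frequency > 1%
        and v["ldl_associated"]        # hypothetical flag for LDL-C association
    ]
    kept = []
    for v in passing:
        # Keep a variant only if its LD with every already-kept variant is below the cutoff.
        if all(pair_r2(ld_r2, v["id"], k["id"]) < max_ld_r2 for k in kept):
            kept.append(v)
    return [v["id"] for v in kept]


# Hypothetical variants and LD values; all numbers are made up.
toy_variants = [
    {"id": "v1", "info": 0.9, "maf": 0.200, "ldl_associated": True},
    {"id": "v2", "info": 0.3, "maf": 0.200, "ldl_associated": True},   # fails imputation quality
    {"id": "v3", "info": 0.8, "maf": 0.005, "ldl_associated": True},   # fails MAF
    {"id": "v4", "info": 0.8, "maf": 0.100, "ldl_associated": True},   # in high LD with v1
    {"id": "v5", "info": 0.7, "maf": 0.150, "ldl_associated": False},  # not LDL-C associated
    {"id": "v6", "info": 0.6, "maf": 0.300, "ldl_associated": True},
]
toy_ld = {frozenset(("v1", "v4")): 0.8, frozenset(("v1", "v6")): 0.1}

selected = filter_variants(toy_variants, toy_ld)
print(selected)
```

Greedy pruning depends on input order. A real analysis would usually decide which variant of a correlated pair to keep by some stated rule, such as the strength of the LDL-C association.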
Project description:Purpose of Review: Over many decades, researchers have designed studies to investigate the relationship between genotypes and phenotypes and so understand the effect of genetics on disease. Recently, a high-throughput approach called the phenome-wide association study (PheWAS) has been used extensively to identify associations between genetic variants and many diseases and traits simultaneously. In this review, we describe the value of PheWAS along with methodological issues and challenges in interpretation for current applications of PheWAS. Recent findings: PheWAS have uncovered a paradigm to identify new associations for genetic loci across many diseases. PheWAS have been applied effectively to phenotype data from electronic health records, epidemiological studies, and clinical trials. Summary: The key strength of a PheWAS is to identify the association of one or more genetic variants with multiple phenotypes, which can showcase interconnections among the phenotypes due to shared genetic associations. While the PheWAS approach appears promising, there are a number of challenges that need to be addressed to provide additional robustness to PheWAS findings.
Project description:Mitochondria play a critical role in the cell and have DNA independent of the nuclear genome. There is much evidence that mitochondrial DNA (mtDNA) variation plays a role in human health and disease; however, this area of investigation has lagged behind research into the role of nuclear genetic variation in complex traits and phenotypic outcomes. Phenome-wide association studies (PheWAS) investigate the association between a wide range of traits and genetic variation. To date, this approach has not been used to investigate the relationship between mtDNA variants and phenotypic variation. Herein, we describe the development of a PheWAS framework for mtDNA variants (mt-PheWAS). Using the Metabochip custom genotyping array, nuclear and mitochondrial DNA variants were genotyped in 11,519 African Americans from the Vanderbilt University biorepository, BioVU. We employed both polygenic modeling and association testing with mitochondrial single nucleotide polymorphisms (mtSNPs) to explore the relationship between mtDNA variants and a group of eight cardiovascular-related traits obtained from de-identified electronic medical records within BioVU. Using polygenic modeling, we found evidence for an effect of mtDNA variation on total cholesterol and type 2 diabetes (T2D). After performing comprehensive mitochondrial single SNP associations, we identified an increased number of single mtSNP associations with total cholesterol and T2D compared to the other phenotypes examined, which did not have more significantly associated SNPs than would be expected by chance. 
Among the mtSNPs significantly associated with T2D, we identified variant mt16189, an association previously reported only in Asian and European-descent populations. Our replication of previous findings and identification of novel associations from this initial study suggest that our mt-PheWAS approach is robust for investigating the relationship between mitochondrial genetic variation and a range of phenotypes, providing a framework for future mt-PheWAS.
Project description:Most phenome-wide association studies (PheWASs) to date have used a small to moderate number of SNPs for association with phenotypic data. We performed a large-scale single-cohort PheWAS, using electronic health record (EHR)-derived case-control status for 541 diagnoses using International Classification of Disease version 9 (ICD-9) codes and 25 median clinical laboratory measures. We calculated associations between these diagnoses and traits and ~630,000 common SNPs (minor allele frequency > 0.01) in 38,662 individuals. In this landscape PheWAS, we explored results within diseases and traits, comparing results to those previously reported in genome-wide association studies (GWASs), as well as previously published PheWASs. We further leveraged the context of functional impact from protein-coding to regulatory regions, providing a deeper interpretation of these associations. The comprehensive nature of this PheWAS allows for novel hypothesis generation, the identification of phenotypes for further study for future phenotypic algorithm development, and identification of cross-phenotype associations.
Project description:Protein tyrosine phosphatase non-receptor type 22 (PTPN22) is a negative regulator of T-cell activation associated with several autoimmune diseases, including systemic lupus erythematosus (SLE). Missense rs2476601 is associated with SLE in individuals with European ancestry. Since the rs2476601 risk allele frequency differs dramatically across ethnicities, we assessed robustness of PTPN22 association with SLE and its clinical sub-phenotypes across four ethnically diverse populations. Ten SNPs were genotyped in 8220 SLE cases and 7369 controls from European-Americans (EA), African-Americans (AA), Asians (AS), and Hispanics (HS). We performed imputation-based association followed by conditional analysis to identify independent associations. Significantly associated SNPs were tested for association with SLE clinical sub-phenotypes, including autoantibody profiles. Multiple testing was accounted for by using false discovery rate. We successfully imputed and tested allelic association for 107 SNPs within the PTPN22 region and detected evidence of ethnic-specific associations from EA and HS. In EA, the strongest association was at rs2476601 (P = 4.7 × 10⁻⁹, OR = 1.40, 95% CI = 1.25-1.56). Independent association with rs1217414 was also observed in EA, and both SNPs are correlated with increased European ancestry. In HS, the imputed intronic SNP rs3765598, predicted to be a cis-eQTL, was associated (P = 0.007, OR = 0.79, 95% CI = 0.67-0.94). No significant associations were observed in AA or AS. Case-only analysis using lupus-related clinical criteria revealed differences between EA SLE patients positive for moderate to high titers of IgG anti-cardiolipin (aCL IgG >20) versus negative aCL IgG at rs2476601 (P = 0.012, OR = 1.65). Association was reinforced when these cases were compared to controls (P = 2.7 × 10⁻⁵, OR = 2.11). Our results validate that rs2476601 is the most significantly associated SNP in individuals with European ancestry. 
Additionally, rs1217414 and rs3765598 may be associated with SLE. Further studies are required to confirm the involvement of rs2476601 with aCL IgG.
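The abstract above states that multiple testing was handled with a false discovery rate, without naming the procedure or the level used. As an illustration only, the sketch below implements the widely used Benjamini-Hochberg step-up procedure in Python. The choice of Benjamini-Hochberg, the 0.05 level, and the toy p-values are all assumptions, not details from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    by the Benjamini-Hochberg step-up procedure at FDR level `alpha`."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, made up for illustration.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.2]
decisions = benjamini_hochberg(toy_p)
print(decisions)
```

Because the procedure is step-up, a p-value that misses its own rank threshold is still rejected when a larger p-value further down the ordering meets its threshold.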
Project description:HMG-CoA reductase (HMGCR) is an enzyme involved in cholesterol synthesis. To investigate the contribution of the HMGCR gene to lipids and lipoprotein subfractions in different ethnicities, we performed an association study in the Multi-Ethnic Study of Atherosclerosis (MESA). In total, 2,444 MESA subjects [597 African-Americans (AA), 627 Chinese-Americans (CHA), 612 European-Americans (EA), and 608 Hispanic-Americans (HA)] without statin use were included. Participants had measurements of blood pressure, anthropometry, and fasting blood samples. Subjects were genotyped for 10 single nucleotide polymorphisms (SNPs). After excluding SNPs with minor allele frequency <5%, a single block was constructed. The most frequent haplotype was H1 (41-56%) in all ethnic groups except AA (H2a, 44.9%). Lower triglyceride level was associated with the H2a haplotype in AA and H2 in HA. In HA, H4 carriers had higher levels of triglyceride and small low-density lipoprotein (s-LDL), and lower high-density lipoprotein cholesterol (HDL-c), while carriers of H7 or H8 had associations with these traits in the opposite direction. No significant association was found in either CHA or EA. The total variation for triglyceride that could be explained by H2 alone was 2.6% in HA and 1.4% in AA. In conclusion, HMGCR gene variation is associated with multiple lipid/lipoprotein traits, especially with triglyceride, s-LDL, and HDL-c. The impact of the genetic variance is modest and differs greatly among ethnicities.
Project description:Blood pressure (BP) is significantly influenced by genetic factors; however, less than 3% of the BP variance has been accounted for by variants identified from genome-wide association studies (GWAS) of primarily European-descent cohorts. Other genetic influences, including gene-environment (GxE) interactions, may explain more of the unexplained variance in BP. African Americans (AA) have a higher prevalence and earlier age of onset of hypertension (HTN) as compared with European Americans (EA); responses to anti-hypertensive drugs vary across race groups. To examine potential interactions between the use of loop diuretics and HTN traits, we analyzed systolic (SBP) and diastolic (DBP) blood pressure from 1222 AA and 1231 EA participants in the Hypertension Genetic Epidemiology Network (HyperGEN). Population-specific score tests were used to test associations of SBP and DBP, using a panel of genotyped and imputed single nucleotide polymorphisms (SNPs) for AA (2.9 million SNPs) and EA (2.3 million SNPs). Several promising loci were identified through gene-loop diuretic interactions, although no SNP reached genome-wide significance after adjustment for genomic inflation. In AA, SNPs in or near the genes NUDT12, CHL1, GRIA1, CACNB2, and PYHIN1 were identified for SBP, and SNPs near ID3 were identified for DBP. For EA, promising SNPs for SBP were identified in ESR1 and for DBP in SPATS2L and EYA2. Among these SNPs, none were common across phenotypes or population groups. Biologic plausibility exists for many of the identified genes, suggesting that these are candidate genes for regulation of BP and/or anti-hypertensive drug response. The lack of genome-wide significance is understandable in this small study employing gene-drug interactions. These findings provide a set of prioritized SNPs/candidate genes for future studies in HTN. Studies in more diversified population samples may help identify previously missed variants.
Project description:BACKGROUND: Multiple genome-wide association studies (GWAS) within European populations have implicated common genetic variants associated with insulin and glucose concentrations. In contrast, few studies have been conducted within minority groups, which carry the highest burden of impaired glucose homeostasis and type 2 diabetes in the U.S. METHODS: As part of the Population Architecture using Genomics and Epidemiology (PAGE) Consortium, we investigated the association of up to 10 GWAS-identified single nucleotide polymorphisms (SNPs) in 8 genetic regions with glucose or insulin concentrations in up to 36,579 non-diabetic subjects, including 23,323 European Americans (EA), 7,526 African Americans (AA), 3,140 Hispanics, 1,779 American Indians (AI), and 811 Asians. We estimated the association between each SNP and fasting glucose or log-transformed fasting insulin, followed by meta-analysis to combine results across PAGE sites. RESULTS: Overall, our results show that 9/9 GWAS SNPs are associated with glucose in EA (p = 0.04 to 9 × 10⁻¹⁵), versus 3/9 in AA (p = 0.03 to 6 × 10⁻⁵), 3/4 SNPs in Hispanics, 2/4 SNPs in AI, and 1/2 SNPs in Asians. For insulin we observed a significant association with rs780094/GCKR in EA, Hispanics and AI only. CONCLUSIONS: Generalization of results across multiple racial/ethnic groups helps confirm the relevance of some of these loci for glucose and insulin metabolism. Lack of association in non-EA groups may be due to insufficient power, or to unique patterns of linkage disequilibrium.
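The abstract above combines per-site SNP estimates by meta-analysis but does not say which model was used. A common choice is inverse-variance-weighted fixed-effect meta-analysis, sketched below in Python under that assumption. The function name and the toy effect sizes and standard errors are hypothetical.

```python
import math


def fixed_effect_meta(estimates):
    """Pool (beta, standard_error) pairs with inverse-variance weights.

    Returns the pooled beta, its standard error, and a two-sided p-value
    from a normal approximation.
    """
    # Each site is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled_beta, pooled_se, p_value


# Hypothetical per-site estimates for one SNP (beta, standard error).
toy_sites = [(0.10, 0.05), (0.10, 0.05)]
pooled_beta, pooled_se, p_value = fixed_effect_meta(toy_sites)
print(pooled_beta, pooled_se, p_value)
```

A fixed-effect model assumes every site estimates the same underlying effect. Where effects may differ across sites or ancestry groups, a random-effects model would be the usual alternative.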