Project description:We show how to use reports of cancer in family members to discover additional genetic associations or confirm previous findings in genome-wide association (GWA) studies conducted in case-control, cohort, or cross-sectional studies. Our novel family history-based approach allows economical association studies for multiple cancers, without genotyping of relatives (as required in family studies), follow-up of participants (as required in cohort studies), or oversampling of specific cancer cases (as required in case-control studies). We empirically evaluate the performance of the proposed family history-based approach in studying associations with prostate and ovarian cancers, using data from GWA studies previously conducted within the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial. The family history-based method may be particularly useful for investigating genetic susceptibility to rare diseases for which accruing cases may be very difficult, by using disease information from nongenotyped relatives of participants in multiple case-control and cohort studies designed primarily for other purposes.
Project description:Over the last few years, many new genetic associations have been identified by genome-wide association studies (GWAS). There are potentially many uses of these identified variants: a better understanding of disease etiology, personalized medicine, new leads for studying underlying biology, and risk prediction. Recently, there has been some skepticism regarding the prospects of risk prediction using GWAS, primarily motivated by the fact that individual effect sizes of variants associated with the phenotype are mostly small. However, there have also been arguments that many disease-associated variants have not yet been identified; hence, prospects for risk prediction may improve if more variants are included. From a risk prediction perspective, it is reasonable to average a larger number of predictors, of which some may have (limited) predictive power and some may actually be noise. The idea is that, when added together, the small signals combine into a signal that is stronger than the noise from the unrelated predictors. We examine various aspects of the construction of models for the estimation of disease probability. We compare different methods of constructing such models, examine how the implementation of cross-validation may influence results, and examine which single nucleotide polymorphisms (SNPs) are most useful for prediction. We carry out our investigation on GWAS of the Wellcome Trust Case Control Consortium. For Crohn's disease, we confirm our results on another GWAS. Our results suggest that utilizing a larger number of SNPs than those which reach genome-wide significance, for example using the lasso, improves the construction of risk prediction models.
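The idea of adding many small signals together can be illustrated with a weighted allele score. The sketch below is not the model-building procedure of the study above; the function name `allele_score`, the weights, and the genotypes are all hypothetical values chosen for illustration. In a lasso fit, some weights would be shrunk to exactly zero, which is mimicked here by the third weight.

```python
def allele_score(genotypes, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2 per SNP)."""
    if len(genotypes) != len(weights):
        raise ValueError("genotypes and weights must have the same length")
    return sum(g * w for g, w in zip(genotypes, weights))


# Hypothetical per-SNP weights (e.g. log odds ratios); the third SNP has
# weight 0.0, as if a sparse method such as the lasso had dropped it.
weights = [0.10, 0.05, 0.0, 0.08]

# Hypothetical risk-allele counts for two individuals at the same four SNPs.
person_a = [2, 1, 0, 2]
person_b = [0, 1, 2, 0]

score_a = allele_score(person_a, weights)
score_b = allele_score(person_b, weights)
print(f"score A = {score_a:.2f}, score B = {score_b:.2f}")
```

Each SNP contributes little on its own, but the summed score separates the two hypothetical individuals more clearly than any single SNP would.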
Project description:Genome-wide association studies (GWASs) have revealed a plethora of putative susceptibility genes for Alzheimer's disease (AD). With the sole exception of the APOE gene, these AD susceptibility genes have not been unequivocally validated in independent studies. No single novel functional risk genetic variant has been identified. In this review, we evaluate recent GWASs of AD, and discuss their significance, limitations, and challenges in the investigation of the genetic spectrum of AD.
Project description:Objective: We examined whether a panel of SNPs, systematically selected from genome-wide association studies (GWAS), could improve risk prediction of coronary heart disease (CHD) over and above conventional risk factors. These SNPs have already demonstrated reproducible associations with CHD; here we examined their use in long-term risk prediction. Study design and setting: SNPs identified from meta-analyses of GWAS of CHD were tested in 840 men and women aged 55-75 from the Edinburgh Artery Study, a prospective, population-based study with 15 years of follow-up. Cox proportional hazards models were used to evaluate the addition of SNPs to conventional risk factors in prediction of CHD risk. CHD was classified as myocardial infarction (MI), coronary intervention (angioplasty or coronary artery bypass surgery), angina and/or unspecified ischaemic heart disease as a cause of death; additional analyses were limited to MI or coronary intervention. Model performance was assessed by changes in discrimination and net reclassification improvement (NRI). Results: There were significant improvements with addition of 27 SNPs to conventional risk factors for prediction of CHD (NRI of 54%, P<0.001; C-index 0.671 to 0.740, P = 0.001), as well as MI or coronary intervention (NRI of 44%, P<0.001; C-index 0.717 to 0.750, P = 0.256). ROC curves showed that addition of SNPs better improved discrimination when the sensitivity of conventional risk factors was low for prediction of MI or coronary intervention. Conclusion: There was significant improvement in risk prediction of CHD over 15 years when SNPs identified from GWAS were added to conventional risk factors. This effect may be particularly useful for identifying individuals with a low prognostic index who are in fact at higher risk of disease than indicated by conventional risk factors alone.
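As a rough illustration of how a net reclassification improvement can be computed, the sketch below implements the category-free (continuous) NRI for a binary outcome. The abstract does not state which NRI variant was used, and this simplified version ignores censoring and follow-up time, so it is an assumption-laden sketch rather than the study's analysis. The function name `continuous_nri` and all risk values are hypothetical.

```python
def continuous_nri(old_risk, new_risk, event):
    """Category-free NRI for a binary outcome.

    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)],
    where "up"/"down" mean the new model gives a higher/lower
    predicted risk than the old model for that individual.
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, had_event in zip(old_risk, new_risk, event):
        if had_event:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Hypothetical predicted risks: conventional factors alone vs. with SNPs added.
old = [0.2, 0.3, 0.4, 0.1, 0.2, 0.3]
new = [0.3, 0.4, 0.3, 0.05, 0.1, 0.4]
event = [1, 1, 1, 0, 0, 0]

nri = continuous_nri(old, new, event)
print(f"continuous NRI = {nri:.3f}")
```

A positive value means the new model tends to move events up and non-events down in predicted risk; the statistic ranges from -2 to 2.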
Project description:Background: Alzheimer's disease is a debilitating and highly heritable neurological condition. As such, genetic studies have sought to understand the genetic architecture of Alzheimer's disease since the 1990s, with successively larger genome-wide association studies (GWAS) and meta-analyses. These studies started with a small sample size of 1086 individuals in 2007, which was able to identify only the APOE locus. In 2013, the International Genomics of Alzheimer's Project (IGAP) did a meta-analysis of all existing GWAS using data from 74 046 individuals, which stood as the largest Alzheimer's disease GWAS until 2018. This meta-analysis discovered 19 susceptibility loci for Alzheimer's disease in populations of European ancestry. Recent developments: Three new Alzheimer's disease GWAS published in 2018 and 2019, which used larger sample sizes and proxy phenotypes from biobanks, have substantially increased the number of known susceptibility loci in Alzheimer's disease to 40. The first, an updated GWAS from IGAP, included 94 437 individuals and discovered 24 susceptibility loci. Although IGAP sought to increase sample size by recruiting additional clinical cases and controls, the two other studies used parental family history of Alzheimer's disease to define proxy cases and controls in the UK Biobank for a genome-wide association by proxy, which was meta-analysed with data from GWAS of clinical Alzheimer's disease to attain sample sizes of 388 324 and 534 403 individuals. These two studies identified 27 and 29 susceptibility loci, respectively. However, the three studies were not independent because of the large overlap in their participants, and interpretation can be challenging because different variants and genes were highlighted by each study, even in the same locus. Furthermore, neither the variant with the strongest Alzheimer's disease association nor the nearest gene is necessarily causal. This situation presents difficulties for experimental studies, drug development, and other future research. Where next? The ultimate goal of understanding the genetic architecture of Alzheimer's disease is to characterise novel biological pathways that underlie Alzheimer's disease pathogenesis and to identify novel drug targets. GWAS have successfully contributed to the characterisation of the genetic architecture of Alzheimer's disease, with the identification of 40 susceptibility loci; however, this does not equate to the discovery of 40 Alzheimer's disease genes. To identify Alzheimer's disease genes, these loci need to be mapped to variants and genes through functional genomics studies that combine annotation of variants, gene expression, and gene-based or pathway-based analyses. Such studies are ongoing and have validated several genes at Alzheimer's disease loci, but greater sample sizes and cell-type specific data are needed to map all GWAS loci.
Project description:To date more than 3700 genome-wide association studies (GWAS) have been published that look at the genetic contributions of single nucleotide polymorphisms (SNPs) to human conditions or human phenotypes. Through these studies many highly significant SNPs have been identified for hundreds of diseases or medical conditions. However, the extent to which GWAS-identified SNPs or combinations of SNP biomarkers can predict disease risk is not well known. One of the most commonly used approaches to assess the performance of predictive biomarkers is to determine the area under the receiver-operator characteristic curve (AUROC). We have developed an R package called G-WIZ to generate ROC curves and calculate the AUROC using summary-level GWAS data. We first tested the performance of G-WIZ by using AUROC values derived from patient-level SNP data, as well as literature-reported AUROC values. We found that G-WIZ predicts the AUROC with <3% error. Next, we used the summary level GWAS data from GWAS Central to determine the ROC curves and AUROC values for 569 different GWA studies spanning 219 different conditions. Using these data we found a small number of GWA studies with SNP-derived risk predictors that have very high AUROCs (>0.75). On the other hand, the average GWA study produces a multi-SNP risk predictor with an AUROC of 0.55. Detailed AUROC comparisons indicate that most SNP-derived risk predictions are not as good as clinically based disease risk predictors. All our calculations (ROC curves, AUROCs, explained heritability) are in a publicly accessible database called GWAS-ROCS (http://gwasrocs.ca). The G-WIZ code is freely available for download at https://github.com/jonaspatronjp/GWIZ-Rscript/.
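For readers unfamiliar with the AUROC, the sketch below computes it from individual-level risk scores as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. This is the standard rank-based definition, not the summary-level procedure used by G-WIZ, and the function name `auroc` and the scores are hypothetical.

```python
def auroc(case_scores, control_scores):
    """AUROC as P(case score > control score), counting ties as 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk-allele counts used directly as risk scores.
cases = [3, 2, 2, 1]
controls = [2, 1, 1, 0]

print(auroc(cases, controls))  # 13 favourable pairs out of 16 -> 0.8125
```

An AUROC of 0.5 corresponds to a predictor no better than chance, which puts the reported average of 0.55 for multi-SNP predictors in context.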
Project description:Genome-wide association studies (GWAS) have gained considerable momentum over the last couple of years for the identification of novel complex disease genes. In the field of Alzheimer's disease (AD), there are currently eight published and two provisionally reported GWAS, highlighting over two dozen novel potential susceptibility loci beyond the well-established APOE association. On the basis of the data available at the time of this writing, the most compelling novel GWAS signal has been observed in GAB2 (GRB2-associated binding protein 2), followed by less consistently replicated signals in galanin-like peptide (GALP), piggyBac transposable element derived 1 (PGBD1), and tyrosine kinase, non-receptor 1 (TNK1). Furthermore, consistent replication has recently been announced for CLU (clusterin, also known as apolipoprotein J). Finally, there are at least three replicated loci in hitherto uncharacterized genomic intervals on chromosomes 14q32.13, 14q31.2 and 6q24.1, likely implicating the existence of novel AD genes in these regions. In this review, we discuss the characteristics and potential relevance to pathogenesis of the outcomes of all currently available GWAS in AD. Particular emphasis is placed on findings with independent data in favor of the original association.
Project description:Background: Glaucoma is a neurodegenerative disease characterized by the progressive loss of retinal ganglion cells and optic nerve axons. According to its anatomical features, glaucoma is mainly subdivided into primary open-angle glaucoma (POAG) and primary angle-closure glaucoma (PACG). Exfoliation syndrome (XFS) and exfoliation glaucoma (XFG) are characterized by the accumulation of extracellular materials in ocular tissues, particularly the lens surface and pupillary border. In addition to the two major forms of glaucoma, XFG is the most common cause of secondary open-angle glaucoma. Recent genome-wide association studies (GWASs) revealed genetic loci associated with each glaucoma subtype. Methods: Review of the literature regarding GWASs for POAG, PACG, and XFS. Results: Several genetic loci were found to be independently associated with POAG, PACG, and XFS by large-scale GWASs. Conclusions: Genetic studies may not only provide a better understanding of the pathophysiological mechanisms underlying the diseases, but also facilitate the development of new drugs or treatments.
Project description:Complex traits such as susceptibility to diseases are determined in part by variants at multiple genetic loci. Genome-wide association studies can identify these loci, but most phenotype-associated variants lie distal to protein-coding regions and are likely involved in regulating gene expression. Understanding how these genetic variants affect complex traits depends on the ability to predict and test the function of the genomic elements harboring them. Community efforts such as the ENCODE Project provide a wealth of data about epigenetic features associated with gene regulation. These data enable the prediction of testable functions for many phenotype-associated variants.