Evaluating the contribution of rare variants to type 2 diabetes and related traits using pedigrees.
ABSTRACT: A major challenge in evaluating the contribution of rare variants to complex disease is identifying enough copies of the rare alleles to permit informative statistical analysis. To investigate the contribution of rare variants to the risk of type 2 diabetes (T2D) and related traits, we performed deep whole-genome analysis of 1,034 members of 20 large Mexican-American families with high prevalence of T2D. If rare variants of large effect accounted for much of the diabetes risk in these families, our experiment was powered to detect association. Using gene expression data on 21,677 transcripts for 643 pedigree members, we identified evidence for large-effect rare-variant cis-expression quantitative trait loci that could not be detected in population studies, validating our approach. However, we did not identify any rare variants of large effect associated with T2D, or the related traits of fasting glucose and insulin, suggesting that large-effect rare variants account for only a modest fraction of the genetic risk of these traits in this sample of families. Reliable identification of large-effect rare variants will require larger samples of extended pedigrees or different study designs that further enrich for such variants.
Project description:The genetic architecture of human diseases governs the success of genetic mapping and the future of personalized medicine. Although numerous studies have queried the genetic basis of common disease, contradictory hypotheses have been advocated about features of genetic architecture (for example, the contribution of rare versus common variants). We developed an integrated simulation framework, calibrated to empirical data, to enable the systematic evaluation of such hypotheses. For type 2 diabetes (T2D), two simple parameters--(i) the target size for causal mutation and (ii) the coupling between selection and phenotypic effect--define a broad space of architectures. Whereas extreme models are excluded by the combination of epidemiology, linkage and genome-wide association studies, many models remain consistent, including those where rare variants explain either little (<25%) or most (>80%) of T2D heritability. Ongoing sequencing and genotyping studies will further constrain the space of possible architectures, but very large samples (for example, >250,000 unselected individuals) will be required to localize most of the heritability underlying T2D and other traits characterized by these models.
Project description:Recently, multiple studies have performed whole-exome or whole-genome sequencing to identify groups of rare variants associated with complex traits and diseases. They have primarily utilized case-control study designs that often require thousands of individuals to reach acceptable statistical power. Family-based studies can be more powerful because a rare variant can be enriched in an extended pedigree and segregate with the phenotype. Although many methods have been proposed for using family data to discover rare variants involved in a disease, a majority of them focus on a specific pedigree structure and are designed to analyze either binary or continuously measured outcomes. In this article, we propose RareIBD, a general and powerful approach to identifying rare variants involved in disease susceptibility. Our method can be applied to large extended families of arbitrary structure, including pedigrees with only affected individuals. The method accommodates both binary and quantitative traits. A series of simulation experiments suggest that RareIBD is a powerful test that outperforms existing approaches. In addition, our method accounts for individuals in top generations, which are not usually genotyped in extended families. In contrast to available statistical tests, RareIBD generates accurate p values even when genetic data from these individuals are missing. We applied RareIBD, as well as other methods, to two extended family datasets generated by different genotyping technologies and representing different ethnicities. The analysis of real data confirmed that RareIBD is the only method that properly controls type I error.
Project description:Current evidence from case/control studies indicates that genetic risk for psychiatric disorders derives primarily from numerous common variants, each with a small phenotypic impact. The literature describing apparent segregation of bipolar disorder (BP) in numerous multigenerational pedigrees suggests that, in such families, large-effect inherited variants might play a greater role. To identify roles of rare and common variants in BP, we conducted genetic analyses in 26 Colombia and Costa Rica pedigrees ascertained for bipolar disorder 1 (BP1), the most severe and heritable form of BP. In these pedigrees, we performed microarray SNP genotyping of 838 individuals and high-coverage whole-genome sequencing of 449 individuals. We compared polygenic risk scores (PRS), estimated using the latest BP1 genome-wide association study (GWAS) summary statistics, between BP1 individuals and related controls. We also evaluated whether BP1 individuals had a higher burden of rare deleterious single-nucleotide variants (SNVs) and rare copy number variants (CNVs) in a set of genes related to BP1. We found that compared with unaffected relatives, BP1 individuals had higher PRS estimated from BP1 GWAS statistics (P = 0.001 ~ 0.007) and displayed a modest increase in burdens of rare deleterious SNVs (P = 0.047) and rare CNVs (P = 0.002 ~ 0.033) in genes related to BP1. We did not observe rare variants segregating in the pedigrees. These results suggest that small-to-moderate effect rare and common variants are more likely to contribute to BP1 risk in these extended pedigrees than a few large-effect rare variants.
Project description:Genome-wide association studies have revealed that common noncoding variants in MTNR1B (encoding melatonin receptor 1B, also known as MT2) increase type 2 diabetes (T2D) risk(1,2). Although the strongest association signal was highly significant (P < 1 × 10^-20), its contribution to T2D risk was modest (odds ratio (OR) of ~1.10-1.15)(1-3). We performed large-scale exon resequencing in 7,632 Europeans, including 2,186 individuals with T2D, and identified 40 nonsynonymous variants, including 36 very rare variants (minor allele frequency (MAF) <0.1%), associated with T2D (OR = 3.31, 95% confidence interval (CI) = 1.78-6.18; P = 1.64 × 10^-4). A four-tiered functional investigation of all 40 mutants revealed that 14 were non-functional and rare (MAF < 1%), and 4 were very rare with complete loss of melatonin binding and signaling capabilities. Among the very rare variants, the partial- or total-loss-of-function variants but not the neutral ones contributed to T2D (OR = 5.67, CI = 2.17-14.82; P = 4.09 × 10^-4). Genotyping the four complete loss-of-function variants in 11,854 additional individuals revealed their association with T2D risk (8,153 individuals with T2D and 10,100 controls; OR = 3.88, CI = 1.49-10.07; P = 5.37 × 10^-3). This study establishes a firm functional link between MTNR1B and T2D risk.
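The odds ratios and 95% confidence intervals quoted above can be illustrated with a minimal sketch. It assumes a simple 2x2 table of carrier counts and the Woolf (log-scale) interval; the abstract does not say how its estimates were obtained, so the function name `odds_ratio_ci` and the counts below are hypothetical and chosen only for illustration.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    All four counts must be positive. z=1.96 gives an approximate 95% interval.
    This is an illustrative assumption, not the method used in the study.
    """
    a, b, c, d = (case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers)
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_, lo, hi = odds_ratio_ci(10, 90, 5, 95)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

With very rare variants the carrier cells are small, so the interval is wide, which is consistent with the broad intervals reported in the abstract. Exact or regression-based methods are often preferred in that setting.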
Project description:The large-scale genome-wide association studies conducted so far have identified numerous allelic variants associated with type 2 diabetes (T2D), coronary heart disease (CHD) and related cardiometabolic traits. Many T2D- and some CHD-risk loci are also linked with metabolic traits that are hallmarks of insulin resistance (lipid profile, abdominal adiposity). Chromosome 9p21.3 and 2q36.3 are the most consistently replicated loci appearing to share genetic risk for both T2D and CHD. Although many glucose- or insulin-related trait variants are also linked with T2D risk, none of them is associated with CHD. Hence, while T2D and CHD are strongly linked clinically, further analyses are needed to clarify whether these complex traits share an underlying genetic signature. The present review summarizes an updated picture of T2D-CHD genetics as of 2013, aiming to provide a platform for targeted studies dissecting the contribution of genetics to the phenotypic heterogeneity of T2D and CHD.
Project description:DNA sequence variants are major components of the "causal field" for virtually all medical phenotypes, whether single gene familial disorders or complex traits without a clear familial aggregation. The causal variants in single gene disorders are necessary and sufficient to impart large effects. In contrast, complex traits are attributable to a much more complicated network of contributory components that in aggregate increase the probability of disease. The conventional approach to identification of the causal variants for single gene disorders is genetic linkage. However, it does not offer sufficient resolution to map the causal genes in small families or sporadic cases. The approach to genetic studies of complex traits entails candidate gene or genome-wide association studies. Genome-wide association studies provide an unbiased survey of the effects of common genetic variants (common disease-common variant hypothesis). Genome-wide association studies have led to identification of a large number of alleles for various cardiovascular diseases. However, common alleles account for a relatively small fraction of the total heritability of the traits. Accordingly, the focus has shifted toward identification of rare variants that might impart larger effect sizes (rare variant-common disease hypothesis). This shift is made feasible by recent advances in massively parallel DNA sequencing platforms, which afford the opportunity to identify virtually all common as well as rare alleles in individuals. In this review, we discuss various strategies that are used to delineate the genetic contribution to medically important cardiovascular phenotypes, emphasizing the utility of the new deep sequencing approaches.
Project description:Peroxisome proliferator-activated receptor gamma (PPARG) is a master transcriptional regulator of adipocyte differentiation and a canonical target of antidiabetic thiazolidinedione medications. In rare families, loss-of-function (LOF) mutations in PPARG are known to cosegregate with lipodystrophy and insulin resistance; in the general population, the common P12A variant is associated with a decreased risk of type 2 diabetes (T2D). Whether and how rare variants in PPARG and defects in adipocyte differentiation influence risk of T2D in the general population remains undetermined. By sequencing PPARG in 19,752 T2D cases and controls drawn from multiple studies and ethnic groups, we identified 49 previously unidentified, nonsynonymous PPARG variants (MAF < 0.5%). Considered in aggregate (with or without computational prediction of functional consequence), these rare variants showed no association with T2D (OR = 1.35; P = 0.17). The function of the 49 variants was experimentally tested in a novel high-throughput human adipocyte differentiation assay, and nine were found to have reduced activity in the assay. Carrying any of these nine LOF variants was associated with a substantial increase in risk of T2D (OR = 7.22; P = 0.005). The combination of large-scale DNA sequencing and functional testing in the laboratory reveals that approximately 1 in 1,000 individuals carries a variant in PPARG that reduces function in a human adipocyte differentiation assay and is associated with a substantial risk of T2D.
Project description:In the last two decades, complex traits have become the main focus of genetic studies. The hypothesis that both rare and common variants are associated with complex traits is increasingly being discussed. Family-based association studies using relatively large pedigrees are suitable for both rare and common variant identification. Because of the high cost of sequencing technologies, imputation methods are important for increasing the amount of information at low cost. A recent family-based imputation method, Genotype Imputation Given Inheritance (GIGI), is able to handle large pedigrees and accurately impute rare variants, but does less well for common variants where population-based methods perform better. Here, we propose a flexible approach to combine imputation data from both family- and population-based methods. We also extend the Sequence Kernel Association Test for Rare and Common variants (SKAT-RC), originally proposed for data from unrelated subjects, to family data in order to make use of such imputed data. We call this extension "famSKAT-RC." We compare the performance of famSKAT-RC and several other existing burden and kernel association tests. In simulated pedigree sequence data, our results show an increase of imputation accuracy from use of our combining approach. Also, they show an increase of power of the association tests with this approach over the use of either family- or population-based imputation methods alone, in the context of rare and common variants. Moreover, our results show better performance of famSKAT-RC compared to the other considered tests, in most scenarios investigated here.
Project description:OBJECTIVE:The authors used a genome-wide association study (GWAS) of multiply affected families to investigate the association of schizophrenia to common single-nucleotide polymorphisms (SNPs) and rare copy number variants (CNVs). METHOD:The family sample included 2,461 individuals from 631 pedigrees (581 in the primary European-ancestry analyses). Association was tested for single SNPs and genetic pathways. Polygenic scores based on family study results were used to predict case-control status in the Schizophrenia Psychiatric GWAS Consortium (PGC) data set, and consistency of direction of effect with the family study was determined for top SNPs in the PGC GWAS analysis. Within-family segregation was examined for schizophrenia-associated rare CNVs. RESULTS:No genome-wide significant associations were observed for single SNPs or for pathways. PGC case and control subjects had significantly different genome-wide polygenic scores (computed by weighting their genotypes by log-odds ratios from the family study) (best p=10^-17, explaining 0.4% of the variance). Family study and PGC analyses had consistent directions for 37 of the 58 independent best PGC SNPs (p=0.024). The overall frequency of CNVs in regions with reported associations with schizophrenia (chromosomes 1q21.1, 15q13.3, 16p11.2, and 22q11.2 and the neurexin-1 gene [NRXN1]) was similar to previous case-control studies. NRXN1 deletions and 16p11.2 duplications (both of which were transmitted from parents) and 22q11.2 deletions (de novo in four cases) did not segregate with schizophrenia in families. CONCLUSIONS:Many common SNPs are likely to contribute to schizophrenia risk, with substantial overlap in genetic risk factors between multiply affected families and cases in large case-control studies. Our findings are consistent with a role for specific CNVs in disease pathogenesis, but the partial segregation of some CNVs with schizophrenia suggests that researchers should exercise caution in using them for predictive genetic testing until their effects in diverse populations have been fully studied.
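The polygenic scores in the schizophrenia family study are described as genotypes weighted by log-odds ratios. A minimal sketch of that weighting follows, assuming genotypes are coded as risk-allele counts (0, 1 or 2) and that one odds ratio per SNP is available. The function name `polygenic_score` and all values are hypothetical, and real pipelines add steps such as SNP pruning and p-value thresholding that are not shown here.

```python
import math


def polygenic_score(allele_counts, odds_ratios):
    """Sum of risk-allele counts weighted by the log of each SNP's odds ratio.

    allele_counts: per-SNP counts (0, 1 or 2) for one individual.
    odds_ratios:   per-SNP odds ratios from a discovery study, in the same order.
    """
    if len(allele_counts) != len(odds_ratios):
        raise ValueError("allele_counts and odds_ratios must have equal length")
    return sum(count * math.log(odds)
               for count, odds in zip(allele_counts, odds_ratios))


# Hypothetical individual and hypothetical discovery odds ratios.
counts = [0, 1, 2]
discovery_odds_ratios = [1.10, 0.95, 1.05]
print(polygenic_score(counts, discovery_odds_ratios))
```

An odds ratio below 1 has a negative log and therefore lowers the score, so protective alleles are handled without special casing.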
Project description:BACKGROUND:We have previously described 19 pedigrees with apparent lamin (LMNA)-related dilated cardiomyopathy (DCM) manifesting in affected family members across multiple generations. In 6 of 19 families, at least 1 individual with idiopathic DCM did not carry the family's LMNA variant. We hypothesized that an additional genetic cause may underlie DCM in these families. METHODS:Affected family members underwent exome sequencing to identify additional genetic causes of DCM in the 6 families with nonsegregating LMNA variants. RESULTS:In 5 of 6 pedigrees, we identified at least 1 additional rare variant in a known DCM gene that could plausibly contribute to disease in the LMNA variant-negative individuals. Bilineal inheritance was clear or presumed to be present in 3 of 5 families and was possible in the remaining 2. At least 1 individual with an LMNA variant also carried a variant in an additional identified DCM gene in each family. Using a multivariate linear mixed model for quantitative traits, we demonstrated that the presence of these additional variants was associated with a more severe phenotype after adjusting for sex, age, and the presence/absence of the family's nonsegregating LMNA variant. CONCLUSIONS:Our data support DCM as a genetically heterogeneous disease with, at times, multigene causation. Although the frequency of DCM resulting from multigenic cause is uncertain, our data suggest it may be higher than previously anticipated.