ABSTRACT: Gene expression as an intermediate molecular phenotype has been a focus of research interest. In particular, studies of expression quantitative trait loci (eQTL) have offered promise for understanding gene regulation through the discovery of genetic variants that explain variation in gene expression levels. Existing eQTL methods are designed for assessing the effects of common variants, but not rare variants. Here, we address the problem by establishing a novel analytical framework for evaluating the effects of rare or private variants on gene expression. Our method starts from the identification of outlier individuals that show markedly different gene expression from the majority of a population, and then reveals the contributions of private SNPs to the aberrant gene expression in these outliers. Using population-scale mRNA sequencing data, we identify outlier individuals using a multivariate approach. We find that outlier individuals are more readily detected with respect to gene sets that include genes involved in cellular regulation and signal transduction, and less likely to be detected with respect to the gene sets with genes involved in metabolic pathways and other fundamental molecular functions. Analysis of polymorphic data suggests that private SNPs of outlier individuals are enriched in the enhancer and promoter regions of corresponding aberrantly-expressed genes, suggesting a specific regulatory role of private SNPs, while the commonly-occurring regulatory genetic variants (i.e., eQTL SNPs) show little evidence of involvement. Additional data suggest that non-genetic factors may also underlie aberrant gene expression. Taken together, our findings advance a novel viewpoint relevant to situations wherein common eQTLs fail to predict gene expression when heritable, rare inter-individual variation exists. 
The analytical framework we describe, taking into consideration the reality of differential phenotypic robustness, may be valuable for investigating complex traits and conditions.
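The abstract above does not say which multivariate statistic is used to identify outlier individuals. As a minimal sketch only, assuming squared Mahalanobis distance over the expression values of one gene set, outlier flagging could look like the following. The function names, the data layout (one row per individual, one column per gene), and the cutoff are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch: flag individuals whose expression profile for a gene set
# lies far from the population centre. Mahalanobis distance is an ASSUMPTION;
# the source only says "a multivariate approach".

def covariance_matrix(rows):
    """Return (column means, sample covariance matrix) for a list of rows."""
    n, p = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in rows) / (n - 1)
            for b in range(p)] for a in range(p)]
    return mu, cov

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def mahalanobis_sq(rows):
    """Squared Mahalanobis distance of every individual from the sample mean."""
    mu, cov = covariance_matrix(rows)
    distances = []
    for r in rows:
        d = [x - c for x, c in zip(r, mu)]
        z = solve(cov, d)  # z = cov^{-1} d, without forming the inverse
        distances.append(sum(di * zi for di, zi in zip(d, z)))
    return distances

def flag_outliers(rows, threshold):
    """Indices of individuals whose squared distance exceeds the (hypothetical) cutoff."""
    return [i for i, d2 in enumerate(mahalanobis_sq(rows)) if d2 > threshold]
```

For a two-gene set, a cutoff of 13.82 (roughly the 99.9th percentile of a chi-square distribution with 2 degrees of freedom) is one possible choice. Because a strong outlier also inflates the sample covariance, a robust covariance estimate may be preferable in practice.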
Project description:Identifying the downstream effects of disease-associated SNPs is challenging. To help overcome this problem, we performed expression quantitative trait locus (eQTL) meta-analysis in non-transformed peripheral blood samples from 5,311 individuals with replication in 2,775 individuals. We identified and replicated trans eQTLs for 233 SNPs (reflecting 103 independent loci) that were previously associated with complex traits at genome-wide significance. Some of these SNPs affect multiple genes in trans that are known to be altered in individuals with disease: rs4917014, previously associated with systemic lupus erythematosus (SLE), altered gene expression of C1QB and five type I interferon response genes, both hallmarks of SLE. DeepSAGE RNA sequencing showed that rs4917014 strongly alters the 3' UTR levels of IKZF1 in cis, and chromatin immunoprecipitation and sequencing analysis of the trans-regulated genes implicated IKZF1 as the causal gene. Variants associated with cholesterol metabolism and type 1 diabetes showed similar phenomena, indicating that large-scale eQTL mapping provides insight into the downstream effects of many trait-associated variants.
Project description:BACKGROUND:The mutations changing the expression level of a gene, or expression quantitative trait loci (eQTL), can be identified by testing the association between genetic variants and gene expression in multiple individuals (eQTL mapping), or by comparing the expression of the alleles in a heterozygous individual (allele specific expression or ASE analysis). The aims of the study were to find and compare ASE and local eQTL in 4 bovine RNA-sequencing (RNA-Seq) datasets, validate them in an independent ASE study and investigate if they are associated with complex trait variation. RESULTS:We present a novel method for distinguishing between ASE driven by polymorphisms in cis and parent of origin effects. We found that single nucleotide polymorphisms (SNPs) driving ASE are also often local eQTL and therefore presumably cis eQTL. These SNPs often, but not always, affect gene expression in multiple tissues and, when they do, the allele increasing expression is usually the same. However, there were systematic differences between ASE and local eQTL and between tissues and breeds. We also found that SNPs significantly associated with gene expression (p < 0.001) were likely to influence some complex traits (p < 0.001), which means that some mutations influence variation in complex traits by changing the expression level of genes. CONCLUSION:We conclude that ASE detects phenomena that overlap with local eQTL, but there are also systematic differences between the SNPs discovered by the two methods. Some mutations influencing complex traits are actually eQTL and can be discovered using RNA-Seq, including eQTL in the genes CAST, CAPN1, LCORL and LEPROTL1.
Project description:For many complex traits, associated genetic variants have been found. However, it is still mostly unclear through which downstream mechanism these variants cause these phenotypes. Knowledge of these intermediate steps is crucial to understand pathogenesis, while also providing leads for potential pharmacological intervention. Here we relied upon natural human genetic variation to identify effects of these variants on trans-gene expression (expression quantitative trait locus mapping, eQTL) in whole peripheral blood from 1,469 unrelated individuals. We looked at 1,167 published trait- or disease-associated SNPs and observed trans-eQTL effects on 113 different genes, of which we replicated 46 in monocytes of 1,490 different individuals and 18 in a smaller dataset that comprised subcutaneous adipose, visceral adipose, liver tissue, and muscle tissue. HLA single-nucleotide polymorphisms (SNPs) were 10-fold enriched for trans-eQTLs: 48% of the trans-acting SNPs map within the HLA, including ulcerative colitis susceptibility variants that affect plausible candidate genes AOAH and TRBV18 in trans. We identified 18 pairs of unlinked SNPs associated with the same phenotype and affecting expression of the same trans-gene (21 times more than expected, P<10(-16)). This was particularly pronounced for mean platelet volume (MPV): Two independent SNPs significantly affect the well-known blood coagulation genes GP9 and F13A1 but also C19orf33, SAMD14, VCL, and GNG11. Several of these SNPs have a substantially higher effect on the downstream trans-genes than on the eventual phenotypes, supporting the concept that the effects of these SNPs on expression seem to be much less multifactorial. Therefore, these trans-eQTLs could well represent some of the intermediate genes that connect genetic variants with their eventual complex phenotypic outcomes.
Project description:The International Genomics of Alzheimer's Project (IGAP) is a consortium for characterizing the genetic landscape of Alzheimer's disease (AD). The identified and/or confirmed 19 single-nucleotide polymorphisms (SNPs) associated with AD are located on non-coding DNA regions, and their functional impacts on AD are as yet poorly understood. We evaluated the roles of the IGAP SNPs by integrating data from many resources, based on whether the IGAP SNP was (1) a proxy for a coding SNP or (2) associated with altered mRNA transcript levels. For (1), we confirmed that 12 AD-associated coding common SNPs and five nonsynonymous rare variants are in linkage disequilibrium with the IGAP SNPs. For (2), the IGAP SNPs in CELF1 and MS4A6A were associated with expression of their neighboring genes, MYBPC3 and MS4A6A, respectively, in blood. The IGAP SNP in DSG2 was an expression quantitative trait locus (eQTL) for DLGAP1 and NETO1 in the human frontal cortex. The IGAP SNPs in ABCA7, CD2AP, and CD33 each acted as eQTL for AD-associated genes in brain. Our approach for identifying proxies and examining eQTL highlighted potentially impactful, novel gene regulatory phenomena pertinent to the AD phenotype.
Project description:Common genetic variations in the IL4 gene have been associated with asthma and atopy in European and Asian populations, but not in African Americans. Because populations of African descent have increased levels of genetic variation compared with other populations, particularly with respect to low frequency or rare variants, we hypothesized that rare variants in the IL4 gene contribute to the development of asthma in African Americans. To test this hypothesis, we sequenced the IL4 locus in 72 African Americans with asthma and 70 African American controls without asthma to identify novel and rare polymorphisms in the IL4 gene that may be contributing to asthma susceptibility. We report an excess of private noncoding single nucleotide polymorphisms (SNPs) in the subjects with asthma compared with control subjects without asthma (P = .031). Tajima's D is significantly more negative in subjects with asthma (-0.375) than controls (-0.073; P = .04), reflecting an excess of rare variants in the subjects with asthma. Our findings indicate that SNPs at the IL4 locus that are potentially exclusive to African Americans are associated with susceptibility to asthma. Only 3 of the 26 private SNPs (ie, SNPs present only in the subjects with asthma or only in the controls) are tagged by single SNPs on one of the common genotyping platforms used in genome-wide association studies. We also find that most of the private SNPs cannot be reliably imputed, highlighting the importance of sequencing to identify genetic variants contributing to common diseases in African Americans.
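The link between a negative Tajima's D and an excess of rare variants can be made concrete with a small calculation. The sketch below implements the standard Tajima's D formula for biallelic sites. The input encoding (equal-length strings of "0" and "1", one per sampled chromosome) and the function name are assumptions made for illustration, and this is not the pipeline used in the study.

```python
import math

def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotype strings of '0' (one allele) and '1' (the other)."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 sequences for a non-zero variance term")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must have equal length")

    pairs = n * (n - 1) / 2
    seg_sites = 0   # S: number of segregating sites
    pi = 0.0        # mean number of pairwise differences
    for site in range(length):
        k = sum(h[site] == "1" for h in haplotypes)
        if 0 < k < n:
            seg_sites += 1
            pi += k * (n - k) / pairs
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # Constants from the standard formula.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)
```

When every segregating site is a singleton, the pairwise diversity falls below S/a1 and D is negative. Sites at intermediate frequency push D positive.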
Project description:Previous expression quantitative trait loci (eQTL) studies have performed genetic association studies for gene expression, but most of these studies examined lymphoblastoid cell lines from non-diseased individuals. We examined the genetics of gene expression in a relevant disease tissue from chronic obstructive pulmonary disease (COPD) patients to identify functional effects of known susceptibility genes and to find novel disease genes. By combining gene expression profiling on induced sputum samples from 131 COPD cases from the ECLIPSE Study with genomewide single nucleotide polymorphism (SNP) data, we found 4315 significant cis-eQTL SNP-probe set associations (3309 unique SNPs). The 3309 SNPs were tested for association with COPD in a genomewide association study (GWAS) dataset, which included 2940 COPD cases and 1380 controls. Adjusting for 3309 tests (p<1.5e-5), the two SNPs which were significantly associated with COPD were located in two separate genes in a known COPD locus on chromosome 15: CHRNA5 and IREB2. Detailed analysis of chromosome 15 demonstrated additional eQTLs for IREB2 mapping to that gene. eQTL SNPs for CHRNA5 mapped to multiple linkage disequilibrium (LD) bins. The eQTLs for IREB2 and CHRNA5 were not in LD. Seventy-four additional eQTL SNPs were associated with COPD at p<0.01. These were genotyped in two COPD populations, finding replicated associations with a SNP in PSORS1C1, in the HLA-C region on chromosome 6. Integrative analysis of GWAS and gene expression data from relevant tissue from diseased subjects has located potential functional variants in two known COPD genes and has identified a novel COPD susceptibility locus.
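The significance cutoff quoted above (p<1.5e-5 for 3309 tests) is consistent with a Bonferroni correction at a family-wise error rate of 0.05. The abstract does not state the correction method or the error rate, so both are assumptions in this short sketch, and the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate at or below alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests

# ASSUMPTION: alpha = 0.05; the source gives only the number of tests and the cutoff.
threshold = bonferroni_threshold(0.05, 3309)
print(f"{threshold:.2e}")
```

Under that assumption the cutoff is about 1.5e-5, which agrees with the value reported for the 3309 cis-eQTL SNPs.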
Project description:Psoriasis, an immune-mediated, inflammatory disease of the skin and joints, provides an ideal system for expression quantitative trait locus (eQTL) analysis, because it has a strong genetic basis and disease-relevant tissue (skin) is readily accessible. To better understand the role of genetic variants regulating cutaneous gene expression, we identified 841 cis-acting eQTLs using RNA extracted from skin biopsies of 53 psoriatic individuals and 57 healthy controls. We found substantial overlap between cis-eQTLs of normal control, uninvolved psoriatic, and lesional psoriatic skin. Consistent with recent studies and with the idea that control of gene expression can mediate relationships between genetic variants and disease risk, we found that eQTL SNPs are more likely to be associated with psoriasis than are randomly selected SNPs. To explore the tissue specificity of these eQTLs and hence to quantify the benefits of studying eQTLs in different tissues, we developed a refined statistical method for estimating eQTL overlap and used it to compare skin eQTLs to a published panel of lymphoblastoid cell line (LCL) eQTLs. Our method accounts for the fact that most eQTL studies are likely to miss some true eQTLs as a result of power limitations and shows that ~70% of cis-eQTLs in LCLs are shared with skin, as compared with the naive estimate of < 50% sharing. Our results provide a useful method for estimating the overlap between various eQTL studies and provide a catalog of cis-eQTLs in skin that can facilitate efforts to understand the functional impact of identified susceptibility variants on psoriasis and other skin traits.
Project description:BACKGROUND:Bipolar disorder is a highly heritable polygenic disorder. Recent enrichment analyses suggest that there may be true risk variants for bipolar disorder in the expression quantitative trait loci (eQTL) in the brain. AIMS:We sought to assess the impact of eQTL variants on bipolar disorder risk by combining data from both bipolar disorder genome-wide association studies (GWAS) and brain eQTL. METHOD:To detect single nucleotide polymorphisms (SNPs) that influence expression levels of genes associated with bipolar disorder, we jointly analysed data from a bipolar disorder GWAS (7481 cases and 9250 controls) and a genome-wide brain (cortical) eQTL (193 healthy controls) using a Bayesian statistical method, with independent follow-up replications. The identified risk SNP was then further tested for association with hippocampal volume (n = 5775) and cognitive performance (n = 342) among healthy individuals. RESULTS:Integrative analysis revealed a significant association between a brain eQTL rs6088662 on chromosome 20q11.22 and bipolar disorder (log Bayes factor = 5.48; bipolar disorder P = 5.85 × 10(-5)). Follow-up studies across multiple independent samples confirmed the association of the risk SNP (rs6088662) with gene expression and bipolar disorder susceptibility (P = 3.54 × 10(-8)). Further exploratory analysis revealed that rs6088662 is also associated with hippocampal volume and cognitive performance in healthy individuals. CONCLUSIONS:Our findings suggest that 20q11.22 is likely a risk region for bipolar disorder; they also highlight the informative value of integrating functional annotation of genetic variants for gene expression in advancing our understanding of the biological basis underlying complex disorders, such as bipolar disorder.
Project description:Cardiovascular disease (CVD) reflects a highly coordinated complex of traits. Although genome-wide association studies have reported numerous single nucleotide polymorphisms (SNPs) to be associated with CVD, the role of most of these variants in disease processes remains unknown. We built a CVD network using 1512 SNPs associated with 21 CVD traits in genome-wide association studies (at P≤5×10(-8)) and cross-linked different traits by virtue of their shared SNP associations. We then explored whole blood gene expression in relation to these SNPs in 5257 participants in the Framingham Heart Study. At a false discovery rate <0.05, we identified 370 cis-expression quantitative trait loci (eQTLs; SNPs associated with altered expression of nearby genes) and 44 trans-eQTLs (SNPs associated with altered expression of remote genes). The eQTL network revealed 13 CVD-related modules. Searching for association of eQTL genes with CVD risk factors (lipids, blood pressure, fasting blood glucose, and body mass index) in the same individuals, we found examples in which the expression of eQTL genes was significantly associated with these CVD phenotypes. In addition, mediation tests suggested that a subset of SNPs previously associated with CVD phenotypes in genome-wide association studies may exert their function by altering expression of eQTL genes (eg, LDLR and PCSK7), which in turn may promote interindividual variation in phenotypes. Using a network approach to analyze CVD traits, we identified complex networks of SNP-phenotype and SNP-transcript connections. Integrating the CVD network with phenotypic data, we identified biological pathways that may provide insights into potential drug targets for treatment or prevention of CVD.
Project description:Interpreting the susceptible loci documented by genome-wide association studies (GWASs) is of utmost importance in the post-GWAS era. Since most complex traits are contributed by multiple tissues, analyzing tissue-specific effects of expression quantitative trait loci (eQTLs) is a promising approach. Here we describe "opposite eQTL effects", i.e., gene expression effects of eQTLs that are in the opposite direction between different tissues, as the biologically meaningful annotations of genes and genetic variants for understanding the GWAS loci. The genes and single-nucleotide polymorphisms (SNPs) associated with the opposite eQTL effects (opp-multi-eQTL-Genes and opp-multi-eQTL-SNPs) were extracted from the largest eQTL database provided by the Genotype-Tissue Expression (GTEx) project (release version 7). The opposite eQTL effects were detected even between closely related tissues such as cerebellum and brain cortex, and a significant proportion of the genes having eQTLs were annotated as the opp-multi-eQTL-Genes (2,323 out of 31,212; 7.4%). The opp-multi-eQTL-SNPs showed locational enrichment at the transcription start site and also possible involvement of epigenetic regulation. The biological importance of the opposite eQTL effects was also assessed using the SNPs reported in GWASs (GWAS-SNPs), which demonstrated that a high proportion of the opp-multi-eQTL-SNPs are in linkage disequilibrium with the GWAS-SNPs (2,498 out of 9,290; 26.9%). Based on the results, the opposite eQTL effects can be a common phenomenon in the tissue-specific gene regulation with a possible contribution to the development of complex traits.