Systematic identification of trans eQTLs as putative drivers of known disease associations.
ABSTRACT: Identifying the downstream effects of disease-associated SNPs is challenging. To help overcome this problem, we performed expression quantitative trait locus (eQTL) meta-analysis in non-transformed peripheral blood samples from 5,311 individuals with replication in 2,775 individuals. We identified and replicated trans eQTLs for 233 SNPs (reflecting 103 independent loci) that were previously associated with complex traits at genome-wide significance. Some of these SNPs affect multiple genes in trans that are known to be altered in individuals with disease: rs4917014, previously associated with systemic lupus erythematosus (SLE), altered gene expression of C1QB and five type I interferon response genes, both hallmarks of SLE. DeepSAGE RNA sequencing showed that rs4917014 strongly alters the 3' UTR levels of IKZF1 in cis, and chromatin immunoprecipitation and sequencing analysis of the trans-regulated genes implicated IKZF1 as the causal gene. Variants associated with cholesterol metabolism and type 1 diabetes showed similar phenomena, indicating that large-scale eQTL mapping provides insight into the downstream effects of many trait-associated variants.
Project description:Background: Prioritizing tag-SNPs carried on extended risk haplotypes at susceptibility loci for common disease is a challenge. Methods: We utilized trans-ancestral exclusion mapping to reduce risk haplotypes at IKZF1 and IKZF3 identified in multiple ancestries from SLE GWAS and ImmunoChip datasets. We characterized functional annotation data across each risk haplotype from publicly available datasets including ENCODE, RoadMap Consortium, PC Hi-C data from 3D genome browser, NESDR NTR conditional eQTL database, GeneCards Genehancers and TF (transcription factor) binding sites from Haploregv4. Results: We refined the 60 kb associated haplotype upstream of IKZF1 to just 12 tag-SNPs tagging a 47.7 kb core risk haplotype. There was preferential enrichment of DNase I hypersensitivity and H3K27ac modification across the 3' end of the risk haplotype, with four tag-SNPs sharing allele-specific TF binding sites with promoter variants, which are eQTLs for IKZF1 in whole blood. At IKZF3, we refined a core risk haplotype of 101 kb (27 tag-SNPs) from an initial extended haplotype of 194 kb (282 tag-SNPs), which had widespread DNase I hypersensitivity, H3K27ac modification and multiple allele-specific TF binding sites. Dimerization of Fox family TFs bound at the 3' and promoter of IKZF3 may stabilize chromatin looping across the locus. Conclusions: We combined trans-ancestral exclusion mapping and epigenetic annotation to identify variants at both IKZF1 and IKZF3 with the highest likelihood of biological relevance. The approach will be of strong interest to other complex trait geneticists seeking to attribute biological relevance to risk alleles on extended risk haplotypes in their disease of interest.
Project description:Many disease-associated variants affect gene expression levels (expression quantitative trait loci, eQTLs) and expression profiling using next generation sequencing (NGS) technology is a powerful way to detect these eQTLs. We analyzed 94 total blood samples from healthy volunteers with DeepSAGE to gain specific insight into how genetic variants affect the expression of genes and lengths of 3'-untranslated regions (3'-UTRs). We detected previously unknown cis-eQTL effects for GWAS hits in disease- and physiology-associated traits. Apart from cis-eQTLs that are typically easily identifiable using microarrays or RNA-sequencing, DeepSAGE also revealed many cis-eQTLs for antisense and other non-coding transcripts, often in genomic regions containing retrotransposon-derived elements. We also identified and confirmed SNPs that affect the usage of alternative polyadenylation sites, thereby potentially influencing the stability of messenger RNAs (mRNA). We then combined the power of RNA-sequencing with DeepSAGE by performing a meta-analysis of three datasets, leading to the identification of many more cis-eQTLs. Our results indicate that DeepSAGE data is useful for eQTL mapping of known and unknown transcripts, and for identifying SNPs that affect alternative polyadenylation. Because of the inherent differences between DeepSAGE and RNA-sequencing, our complementary, integrative approach leads to greater insight into the molecular consequences of many disease-associated variants.
Project description:Trans-eQTLs have been implicated in complex traits and common diseases, but many were initially identified on the basis of having an effect in cis, and there has been no assessment of the significance of the overlap in relation to chance expectations. Here, we investigated whether trans-expression quantitative trait loci (eQTL) associations identified in whole blood contribute to variance in complex traits by determining (1) whether genome-wide significant (GWS) single-nucleotide polymorphisms (SNPs) were enriched for trans-eQTL (including trans-only eQTL), and (2) whether the genomic regions surrounding associated trans-genes were enriched for statistical associations in the relevant GWAS. On average for a given phenotype, we identify 4.8% of GWS SNPs overlapping with trans-eQTL present in blood, and show that for the majority of these phenotypes, this observation does not exceed that expected by chance. Likewise, we observe no enrichment for genetic associations with the GWAS phenotype in the regions surrounding the linked trans-genes, with the exception of rheumatoid arthritis. Interestingly, the GWS SNPs for each phenotype were consistently more enriched for unique trans-eQTL SNPs than trans-eQTL SNP-probe pairs (p = 4 × 10(-7)), with schizophrenia the only exception. This relative enrichment for trans-eQTL SNPs over trans-eQTL SNP-probe pairs implies that trait-associated trans-eQTL SNPs in whole blood are less likely to be 'master regulators' than random trans-eQTL SNPs. Taken together, these results suggest little evidence for the role of blood-based trans-eQTL in complex traits and disease, although this may reflect the finite size of currently available data sets and our findings may not hold for trans-eQTLs in more trait-relevant tissues. All software is publicly available at https://github.com/IMB-Computational-Genomics-Lab/eqtlOverlapper .
Project description:Although rare variant C1Q deficiency was identified as causative risk for systemic lupus erythematosus (SLE), there are limited and inconsistent reports regarding the common polymorphisms of C1Q genes in SLE susceptibility. Furthermore, there are no reports concerning polymorphisms of C1S, C1R, and C1RL and whether they confer susceptibility to SLE. We therefore evaluated 22 SNPs across six C1-complex genes in two independent case-control cohorts, and identified four novel SNPs that confer protection from SLE. The four SNPs are all located in C1Q. In particular, the variant rs653286 was independently associated with reduced risk of SLE (OR 0.75, P = 2.16 × 10(-3)) and of anti-dsDNA antibodies (OR 0.68, P = 0.024). By bioinformatics analysis, SNPs rs653286 and rs291985 displayed striking cis-eQTL effects on C1Q gene expression. Individuals homozygous for the 'protective' allele at four SNPs had significantly higher levels of serum C1q (rs680123-rs682658: P = 0.0022; rs653286-rs291985: P = 0.0076). To our knowledge, this is the first study to demonstrate that only C1Q polymorphisms are associated with SLE. The C1Q SNP rs653286 confers an independent protective effect on SLE susceptibility and affects transcript abundance.
Project description:For many complex traits, associated genetic variants have been found. However, it is still mostly unclear through which downstream mechanism these variants cause these phenotypes. Knowledge of these intermediate steps is crucial to understand pathogenesis, while also providing leads for potential pharmacological intervention. Here we relied upon natural human genetic variation to identify effects of these variants on trans-gene expression (expression quantitative trait locus mapping, eQTL) in whole peripheral blood from 1,469 unrelated individuals. We looked at 1,167 published trait- or disease-associated SNPs and observed trans-eQTL effects on 113 different genes, of which we replicated 46 in monocytes of 1,490 different individuals and 18 in a smaller dataset that comprised subcutaneous adipose, visceral adipose, liver tissue, and muscle tissue. HLA single-nucleotide polymorphisms (SNPs) were 10-fold enriched for trans-eQTLs: 48% of the trans-acting SNPs map within the HLA, including ulcerative colitis susceptibility variants that affect plausible candidate genes AOAH and TRBV18 in trans. We identified 18 pairs of unlinked SNPs associated with the same phenotype and affecting expression of the same trans-gene (21 times more than expected, P<10(-16)). This was particularly pronounced for mean platelet volume (MPV): Two independent SNPs significantly affect the well-known blood coagulation genes GP9 and F13A1 but also C19orf33, SAMD14, VCL, and GNG11. Several of these SNPs have a substantially higher effect on the downstream trans-genes than on the eventual phenotypes, supporting the concept that the effects of these SNPs on expression seem to be much less multifactorial. Therefore, these trans-eQTLs could well represent some of the intermediate genes that connect genetic variants with their eventual complex phenotypic outcomes.
Project description:Protein tyrosine phosphatase non-receptor type 22 (PTPN22) is a negative regulator of T-cell activation associated with several autoimmune diseases, including systemic lupus erythematosus (SLE). Missense rs2476601 is associated with SLE in individuals with European ancestry. Since the rs2476601 risk allele frequency differs dramatically across ethnicities, we assessed robustness of PTPN22 association with SLE and its clinical sub-phenotypes across four ethnically diverse populations. Ten SNPs were genotyped in 8220 SLE cases and 7369 controls from European-Americans (EA), African-Americans (AA), Asians (AS), and Hispanics (HS). We performed imputation-based association followed by conditional analysis to identify independent associations. Significantly associated SNPs were tested for association with SLE clinical sub-phenotypes, including autoantibody profiles. Multiple testing was accounted for by using false discovery rate. We successfully imputed and tested allelic association for 107 SNPs within the PTPN22 region and detected evidence of ethnic-specific associations from EA and HS. In EA, the strongest association was at rs2476601 (P = 4.7 × 10(-9), OR = 1.40 (95% CI = 1.25-1.56)). Independent association with rs1217414 was also observed in EA, and both SNPs are correlated with increased European ancestry. In HS, the imputed intronic SNP rs3765598, predicted to be a cis-eQTL, was associated (P = 0.007, OR = 0.79 and 95% CI = 0.67-0.94). No significant associations were observed in AA or AS. Case-only analysis using lupus-related clinical criteria revealed differences between EA SLE patients positive for moderate to high titers of IgG anti-cardiolipin (aCL IgG >20) versus negative aCL IgG at rs2476601 (P = 0.012, OR = 1.65). Association was reinforced when these cases were compared to controls (P = 2.7 × 10(-5), OR = 2.11). Our results validate that rs2476601 is the most significantly associated SNP in individuals with European ancestry. Additionally, rs1217414 and rs3765598 may be associated with SLE. Further studies are required to confirm the involvement of rs2476601 with aCL IgG.
Project description:Recent genome-wide association studies (GWASs) conducted in Asian populations have identified novel risk loci for systemic lupus erythematosus (SLE). Here, we genotyped 10 single-nucleotide polymorphisms (SNPs) in eight such loci and investigated their disease associations in three independent Caucasian SLE case-control cohorts recruited from Sweden, Finland and the United States. The disease associations of the SNPs in ETS1, IKZF1, LRRC18-WDFY4, RASGRP3, SLC15A4, TNIP1 and 16p11.2 were replicated, whereas no solid evidence of association was observed for the 7q11.23 locus in the Caucasian cohorts. SLC15A4 was significantly associated with renal involvement in SLE. The association of TNIP1 was more pronounced in SLE patients with renal and immunological disorder, which is corroborated by two previous studies in Asian cohorts. The effects of all the associated SNPs, either conferring risk for or being protective against SLE, were in the same direction in Caucasians and Asians. The magnitudes of the allelic effects for most of the SNPs were also comparable across different ethnic groups. On the contrary, remarkable differences in allele frequencies between Caucasian and Asian populations were observed for all associated SNPs. In conclusion, most of the novel SLE risk loci identified by GWASs in Asian populations were also associated with SLE in Caucasian populations. We observed both similarities and differences with respect to the effect sizes and risk allele frequencies across ethnicities.
Project description:BACKGROUND: Identification of single nucleotide polymorphisms (SNPs) associated with gene expression levels, known as expression quantitative trait loci (eQTLs), may improve understanding of the functional role of phenotype-associated SNPs in genome-wide association studies (GWAS). The small sample sizes of some previous eQTL studies have limited their statistical power. We conducted an eQTL investigation of microarray-based gene and exon expression levels in whole blood in a cohort of 5257 individuals, exceeding the single cohort size of previous studies by more than a factor of 2. RESULTS: We detected over 19,000 independent lead cis-eQTLs and over 6000 independent lead trans-eQTLs, targeting over 10,000 gene targets (eGenes), with a false discovery rate (FDR) < 5%. Of previously published significant GWAS SNPs, 48% are identified to be significant eQTLs in our study. Some trans-eQTLs point toward novel mechanistic explanations for the association of the SNP with the GWAS-related phenotype. We also identify 59 distinct blocks or clusters of trans-eQTLs, each targeting the expression of sets of six to 229 distinct trans-eGenes. Ten of these sets of target genes are significantly enriched for microRNA targets (FDR < 5%). Many of these clusters are associated in GWAS with multiple phenotypes. CONCLUSIONS: These findings provide insights into the molecular regulatory patterns involved in human physiology and pathophysiology. We illustrate the value of our eQTL database in the context of a recent GWAS meta-analysis of coronary artery disease and provide a list of targeted eGenes for 21 of 58 GWAS loci.
Project description:Genome wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with diseases of the colon including inflammatory bowel diseases (IBD) and colorectal cancer (CRC). However, the functional role of many of these SNPs is largely unknown and tissue-specific resources are lacking. Expression quantitative trait loci (eQTL) mapping identifies target genes of disease-associated SNPs. Here, we comprehensively map eQTLs in the human colon, assess their relevance for GWAS of colonic diseases and provide functional characterization. Subjects included 40 healthy African American individuals who had undergone colonoscopy at the University of Illinois Chicago for screening purposes. Distal colonic biopsies were obtained in all subjects at 20 cm from the anal verge at the recto-sigmoid junction and were immediately dispensed in RNAlater. Total mRNA was extracted from manually ground tissue with the Promega Maxwell 16 Tissue LEV Total RNA Purification Kit for automated purification on the Maxwell 16 Instrument and mRNA analysis was performed on Illumina HumanHT-12v4 Expression BeadChip arrays. Genomic DNA was obtained from whole-blood samples from the same individuals and genotyped using the Affymetrix Axiom Genome-wide Pan-African array. Cis- and trans-eQTL analyses were performed on the dataset of 8.4 million imputed SNPs and 16,252 expression probes corresponding to 12,363 unique autosomal genes in 40 subjects. Associations between SNPs and gene expression levels were examined with Matrix eQTL using linear regression. False discovery rate calculations were performed separately for cis- and trans-eQTLs.
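Illustrative sketch (not part of the study record): the last two steps described above, a per-SNP linear regression of expression on genotype followed by false discovery rate control applied separately to cis and trans tests, can be outlined in a few lines of Python. This is only a hedged approximation of the idea. It does not call Matrix eQTL, the abstract does not state which FDR procedure was used (Benjamini-Hochberg is assumed here), and the function names, dosages, expression values and p-values are all hypothetical.

```python
from statistics import mean


def regress_expression_on_genotype(dosages, expression):
    """Least-squares fit of expression ~ dosage; returns (slope, t_statistic)."""
    n = len(dosages)
    g_bar, e_bar = mean(dosages), mean(expression)
    sxx = sum((g - g_bar) ** 2 for g in dosages)
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(dosages, expression))
    slope = sxy / sxx
    intercept = e_bar - slope * g_bar
    # Residual sum of squares gives the standard error of the slope.
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(dosages, expression))
    std_err = (rss / (n - 2) / sxx) ** 0.5
    return slope, slope / std_err


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical data: allele dosages (0, 1, 2) and probe expression for six subjects.
dosages = [0, 1, 2, 0, 1, 2]
expression = [1.0, 2.1, 2.9, 1.2, 1.9, 3.1]
slope, t_stat = regress_expression_on_genotype(dosages, expression)

# Hypothetical p-values; cis and trans tests are adjusted as separate families.
cis_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
trans_adjusted = benjamini_hochberg([0.2, 0.001, 0.5])
```

Adjusting the cis and trans families separately matters because the number of trans tests is far larger than the number of cis tests, so a shared correction would penalize cis associations much more heavily.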
Project description:Systemic lupus erythematosus (SLE) is a chronic autoimmune disorder whose etiology is incompletely understood, but likely involves environmental triggers in genetically susceptible individuals. Using an unbiased genome-wide association (GWA) scan and replication analysis, we sought to identify the genetic loci associated with SLE in a Korean population. A total of 1,174 SLE cases and 4,246 population controls from Korea were genotyped and analyzed with a GWA scan to identify single-nucleotide polymorphisms (SNPs) significantly associated with SLE, after strict quality control measures were applied. For select variants, replication of SLE risk loci was tested in an independent data set of 1,416 SLE cases and 1,145 population controls from Korea and China. Eleven regions outside the HLA exceeded the genome-wide significance level (P = 5 × 10(-8)). A novel SNP-SLE association was identified between FCHSD2 and P2RY2, peaking at rs11235667 (P = 1.03 × 10(-8), odds ratio [OR] 0.59) on a 33-kb haplotype upstream of ATG16L2. In the independent replication data set, the SNP rs11235667 continued to show a significant association with SLE (replication meta-analysis P = 0.001, overall meta-analysis P = 6.67 × 10(-11); OR 0.63). Within the HLA region, the SNP-SLE association peaked in the class II region at rs116727542, with multiple independent effects observed in this region. Classic HLA allele imputation analysis identified HLA-DRB1*1501 and HLA-DQB1*0602, each highly correlated with one another, as most strongly associated with SLE. Ten previously established SLE risk loci were replicated: STAT1-STAT4, TNFSF4, TNFAIP3, IKZF1, HIP1, IRF5, BLK, WDFY4, ETS1, and IRAK1-MECP2. Of these loci, previously unreported, independent second risk effects of SNPs in TNFAIP3 and TNFSF4, as well as differences in the association with a putative causal variant in the WDFY4 region, were identified. Further studies are needed to identify true SLE risk effects in other loci suggestive of a significant association, and to identify the causal variants in the regions of ATG16L2, FCHSD2, and P2RY2.