Gene expression in skin and lymphoblastoid cells: Refined statistical method reveals extensive overlap in cis-eQTL signals.
ABSTRACT: Psoriasis, an immune-mediated, inflammatory disease of the skin and joints, provides an ideal system for expression quantitative trait locus (eQTL) analysis, because it has a strong genetic basis and disease-relevant tissue (skin) is readily accessible. To better understand the role of genetic variants regulating cutaneous gene expression, we identified 841 cis-acting eQTLs using RNA extracted from skin biopsies of 53 psoriatic individuals and 57 healthy controls. We found substantial overlap between cis-eQTLs of normal control, uninvolved psoriatic, and lesional psoriatic skin. Consistent with recent studies and with the idea that control of gene expression can mediate relationships between genetic variants and disease risk, we found that eQTL SNPs are more likely to be associated with psoriasis than are randomly selected SNPs. To explore the tissue specificity of these eQTLs and hence to quantify the benefits of studying eQTLs in different tissues, we developed a refined statistical method for estimating eQTL overlap and used it to compare skin eQTLs to a published panel of lymphoblastoid cell line (LCL) eQTLs. Our method accounts for the fact that most eQTL studies are likely to miss some true eQTLs as a result of power limitations and shows that ~70% of cis-eQTLs in LCLs are shared with skin, as compared with the naive estimate of <50% sharing. Our results provide a useful method for estimating the overlap between various eQTL studies and provide a catalog of cis-eQTLs in skin that can facilitate efforts to understand the functional impact of identified susceptibility variants on psoriasis and other skin traits.
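The gap between a naive and a power-aware overlap estimate can be illustrated with a deliberately simplified sketch. It is not the refined method described in the abstract, whose details are not given here; it only assumes that the second study detects a truly shared eQTL with some known probability (`power`, a hypothetical input), so the observed replication fraction understates true sharing. The function name and all numbers are illustrative.

```python
def corrected_sharing(n_replicated, n_tested, power):
    """Rough power-adjusted estimate of eQTL sharing between two studies.

    n_replicated: eQTLs from study A that also reach significance in study B
    n_tested:     eQTLs from study A that were tested in study B
    power:        assumed probability that study B detects a truly shared
                  eQTL (hypothetical input, 0 < power <= 1)
    """
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    if not 0 < power <= 1:
        raise ValueError("power must be in (0, 1]")
    naive = n_replicated / n_tested      # ignores missed true eQTLs
    # If only a fraction `power` of shared eQTLs is detectable, the naive
    # fraction is roughly (true sharing) * power; invert and cap at 1.
    return min(1.0, naive / power)


# Illustrative numbers only: 45 of 100 eQTLs replicate, assumed power 0.5.
estimate = corrected_sharing(45, 100, 0.5)
```

With full power (`power=1.0`) the function returns the naive fraction unchanged, which makes the direction of the correction easy to see.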
Project description:Expression quantitative trait locus (eQTL) analysis, which links variations in gene expression to genotypes, is essential to understanding gene regulation and to interpreting disease-associated loci. Currently identified eQTLs are mainly in samples of blood and other normal tissues. However, no database comprehensively provides eQTLs in a large number of cancer samples. Using the genotype and expression data of 9196 tumor samples in 33 cancer types from The Cancer Genome Atlas (TCGA), we identified 5 606 570 eQTL-gene pairs in the cis-eQTL analysis and 231 210 eQTL-gene pairs in the trans-eQTL analysis. We further performed survival analysis and identified 22 212 eQTLs associated with patient overall survival. Furthermore, we linked the eQTLs to genome-wide association studies (GWAS) data and identified 337 131 eQTLs that overlap with existing GWAS loci. We developed PancanQTL, a user-friendly database (http://bioinfo.life.hust.edu.cn/PancanQTL/), to store cis-eQTLs, trans-eQTLs, survival-associated eQTLs and GWAS-related eQTLs to enable searching, browsing and downloading. PancanQTL could help the research community understand the effects of inherited variants in tumorigenesis and development.
Project description:Most expression quantitative trait locus (eQTL) studies to date have been performed in heterogeneous tissues as opposed to specific cell types. To better understand the cell-type-specific regulatory landscape of human melanocytes, which give rise to melanoma but account for <5% of typical human skin biopsies, we performed an eQTL analysis in primary melanocyte cultures from 106 newborn males. We identified 597,335 cis-eQTL SNPs prior to linkage disequilibrium (LD) pruning and 4997 eGenes (FDR < 0.05). Melanocyte eQTLs differed considerably from those identified in the 44 GTEx tissue types, including skin. Over a third of melanocyte eGenes, including key genes in melanin synthesis pathways, were unique to melanocytes compared to those of GTEx skin tissues or TCGA melanomas. The melanocyte data set also identified trans-eQTLs, including those connecting a pigmentation-associated functional SNP with four genes, likely through cis-regulation of IRF4. Melanocyte eQTLs are enriched in cis-regulatory signatures found in melanocytes as well as in melanoma-associated variants identified through genome-wide association studies. Melanocyte eQTLs also colocalized with melanoma GWAS variants in five known loci. Finally, a transcriptome-wide association study using melanocyte eQTLs uncovered four novel susceptibility loci, where imputed expression levels of five genes (ZFP90, HEBP1, MSC, CBWD1, and RP11-383H13.1) were associated with melanoma at genome-wide significant P-values. Our data highlight the utility of lineage-specific eQTL resources for annotating GWAS findings, and present a robust database for genomic research of melanoma risk and melanocyte biology.
Project description:In its simplest definition, a cis-eQTL, as opposed to a trans-eQTL, is a genetic variant that affects expression in an allele-specific manner, with implications for the underlying mechanism. Yet, due to technical limitations of expression microarrays, the vast majority of eQTL studies performed in the last decade used a genomic-distance-based definition as a surrogate for cis, therefore exploring local rather than cis-eQTLs. In this study we use RNAseq to explore allele-specific expression (ASE) in adipose tissue of male and female F1 mice, produced from reciprocal crosses of C57BL/6J and DBA/2J strains. Comparison of the identified cis-eQTLs with local-eQTLs obtained from adipose tissue expression in two previous population-based studies in our laboratory yields poor overlap between the two mapping approaches, while both local-eQTL studies show highly concordant results. Specifically, local-eQTL studies show ~60% overlap between themselves, while only 15-20% of local-eQTLs are identified as cis by ASE, and less than 50% of ASE genes are recovered in local-eQTL studies. Utilizing recently published ENCODE data, we also find that ASE genes show a significant bias in SNP prevalence in DNase I hypersensitive sites that is ASE-direction specific. We suggest a new approach to the analysis of allele-specific expression that is more sensitive and accurate than the commonly used Fisher or chi-square statistics. Our analysis indicates that technical differences between the cis and local-eQTL approaches, such as differences in genomic background or sex specificity, account for a relatively small fraction of the discrepancy. Therefore, we suggest that the differences between the two eQTL mapping approaches may facilitate sorting of SNP-eQTL interactions into true cis and trans, and that a considerable portion of local-eQTLs may actually represent trans interactions.
Project description:Genetic variants in cis-regulatory elements or trans-acting regulators frequently influence the quantity and spatiotemporal distribution of gene transcription. Recent interest in expression quantitative trait locus (eQTL) mapping has paralleled the adoption of genome-wide association studies (GWAS) for the analysis of complex traits and disease in humans. Under the hypothesis that many GWAS associations tag non-coding SNPs with small effects, and that these SNPs exert phenotypic control by modifying gene expression, it has become common to interpret GWAS associations using eQTL data. To fully exploit the mechanistic interpretability of eQTL-GWAS comparisons, an improved understanding of the genetic architecture and causal mechanisms of cell type specificity of eQTLs is required. We address this need by performing an eQTL analysis in three parts: first we identified eQTLs from eleven studies on seven cell types; then we integrated eQTL data with cis-regulatory element (CRE) data from the ENCODE project; finally we built a set of classifiers to predict the cell type specificity of eQTLs. The cell type specificity of eQTLs is associated with eQTL SNP overlap with hundreds of cell type specific CRE classes, including enhancer, promoter, and repressive chromatin marks, regions of open chromatin, and many classes of DNA binding proteins. These associations provide insight into the molecular mechanisms generating the cell type specificity of eQTLs and the mode of regulation of corresponding eQTLs. Using a random forest classifier with cell specific CRE-SNP overlap as features, we demonstrate the feasibility of predicting the cell type specificity of eQTLs. We then demonstrate that CREs from a trait-associated cell type can be used to annotate GWAS associations in the absence of eQTL data for that cell type. 
We anticipate that such integrative, predictive modeling of cell specificity will improve our ability to understand the mechanistic basis of human complex phenotypic variation.
Project description:Most signals detected by genome-wide association studies map to non-coding sequence and their tissue-specific effects influence transcriptional regulation. However, key tissues and cell-types required for functional inference are absent from large-scale resources. Here we explore the relationship between genetic variants influencing predisposition to type 2 diabetes (T2D) and related glycemic traits, and human pancreatic islet transcription using data from 420 donors. We find: (a) 7741 cis-eQTLs in islets with a replication rate across 44 GTEx tissues between 40% and 73%; (b) marked overlap between islet cis-eQTL signals and active regulatory sequences in islets, with reduced eQTL effect size observed in the stretch enhancers most strongly implicated in GWAS signal location; (c) enrichment of islet cis-eQTL signals with T2D risk variants identified in genome-wide association studies; and (d) colocalization between 47 islet cis-eQTLs and variants influencing T2D or glycemic traits, including DGKB and TCF7L2. Our findings illustrate the advantages of performing functional and regulatory studies in disease relevant tissues.
Project description:Disease variants identified by genome-wide association studies (GWAS) tend to overlap with expression quantitative trait loci (eQTLs), but it remains unclear whether this overlap is driven by gene expression levels 'mediating' genetic effects on disease. Here, we introduce a new method, mediated expression score regression (MESC), to estimate disease heritability mediated by the cis genetic component of gene expression levels. We applied MESC to GWAS summary statistics for 42 traits (average N = 323,000) and cis-eQTL summary statistics for 48 tissues from the Genotype-Tissue Expression (GTEx) consortium. Averaging across traits, only 11 ± 2% of heritability was mediated by assayed gene expression levels. Expression-mediated heritability was enriched in genes with evidence of selective constraint and genes with disease-appropriate annotations. Our results demonstrate that assayed bulk tissue eQTLs, although disease relevant, cannot explain the majority of disease heritability.
Project description:Emerging evidence emphasizes the strong impact of regulatory genomic elements in neurodevelopmental processes and the complex pathways of brain disorders. The present genome-wide quantitative trait loci analyses explore the cis-regulatory effects of single-nucleotide polymorphisms (SNPs) on DNA methylation (meQTL) and gene expression (eQTL) in 110 human hippocampal biopsies. We identify cis-meQTLs at 14,118 CpG methylation sites and cis-eQTLs for 302 3'-mRNA transcripts of 288 genes. Hippocampal cis-meQTL-CpGs are enriched in flanking regions of active promoters, CpG island shores, binding sites of the transcription factor CTCF and brain eQTLs. Cis-acting SNPs of hippocampal meQTLs and eQTLs significantly overlap schizophrenia-associated SNPs. Correlations of CpG methylation and RNA expression are found for 34 genes. Our comprehensive maps of cis-acting hippocampal meQTLs and eQTLs provide a link between disease-associated SNPs and the regulatory genome that will improve the functional interpretation of non-coding genetic variants in the molecular genetic dissection of brain disorders.
Project description:Genome-wide association studies (GWAS) have identified numerous prostate cancer-associated risk loci. Some variants at these loci may be regulatory and influence expression of nearby genes. Such loci are known as cis-expression quantitative trait loci (cis-eQTL). As cis-eQTLs are highly tissue-specific, we asked if GWAS-identified prostate cancer risk loci are cis-eQTLs in human prostate tumor tissues. We investigated 50 prostate cancer samples for their genotype at 59 prostate cancer risk-associated single-nucleotide polymorphisms (SNPs) and performed cis-eQTL analysis of transcripts from paired primary tumors within two megabase windows. We tested 586 transcript-genotype associations, of which 27 were significant (false discovery rate ≤10%). An equivalent eQTL analysis of the same prostate cancer risk loci in lymphoblastoid cell lines did not result in any significant associations. The top-ranked cis-eQTL involved the IRX4 (Iroquois homeobox protein 4) transcript and rs12653946, tagged by rs10866528 in our study (P = 4.91 × 10^-5). Replication studies, linkage disequilibrium, and imputation analyses highlight population specificity at this locus. We independently validated IRX4 as a potential prostate cancer risk gene through cis-eQTL analysis of prostate cancer risk variants. Cis-eQTL analysis in relevant tissues, even with a small sample size, can be a powerful method to expedite functional follow-up of GWAS.
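Selecting significant associations at a false discovery rate threshold, as in the 586 tests above, is commonly done with the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR procedure was used, so the following is a minimal sketch under that assumption; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return a list of booleans marking which tests are called
    significant by the Benjamini-Hochberg step-up procedure.

    Assumption: BH is used only as an illustration; the study's own
    FDR procedure is not specified in the abstract.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest P-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= fdr * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is significant, even if
    # its own P-value exceeded its own threshold (the step-up property).
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Hypothetical P-values for four transcript-genotype tests.
calls = benjamini_hochberg([0.001, 0.02, 0.04, 0.5], fdr=0.10)
```

The result is returned in the input order, so it can be paired directly with the list of transcript-genotype associations that produced the P-values.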
Project description:Epilepsy is a serious and common neurological disorder. Expression quantitative trait locus (eQTL) analysis is a vital aid for the identification and interpretation of disease-risk loci. Many eQTLs operate in a tissue- and condition-specific manner. We have performed the first genome-wide cis-eQTL analysis of human hippocampal tissue to include not only normal (n = 22) but also epileptic (n = 22) samples. We demonstrate that disease-associated variants from an epilepsy GWAS meta-analysis and a febrile seizures (FS) GWAS are significantly more enriched with epilepsy-eQTLs than with normal hippocampal eQTLs from two larger independent published studies. In contrast, GWAS meta-analyses of two other brain diseases associated with hippocampal pathology (Alzheimer's disease and schizophrenia) are more enriched with normal hippocampal eQTLs than with epilepsy-eQTLs. These observations suggest that an eQTL analysis that includes disease-affected brain tissue is advantageous for detecting additional risk SNPs for the afflicting and closely related disorders, but not for distinct diseases affecting the same brain regions. We also show that epilepsy eQTLs are enriched within epilepsy-causing genes: an epilepsy cis-gene is significantly more likely to be a causal gene for a Mendelian epilepsy syndrome than to be a causal gene for another Mendelian disorder. Epilepsy cis-genes, compared to normal hippocampal cis-genes, are more enriched within epilepsy-causing genes. Hence, we utilize the epilepsy eQTL data for the functional interpretation of epilepsy disease-risk variants and, thereby, highlight novel potential causal genes for sporadic epilepsy. In conclusion, an epilepsy-eQTL analysis is superior to normal hippocampal tissue eQTL analyses for identifying the variants and genes underlying epilepsy.
Project description:The functional consequences of trait associated SNPs are often investigated using expression quantitative trait locus (eQTL) mapping. While trait-associated variants may operate in a cell-type specific manner, eQTL datasets for such cell-types may not always be available. We performed a genome-environment interaction (GxE) meta-analysis on data from 5,683 samples to infer the cell type specificity of whole blood cis-eQTLs. We demonstrate that this method is able to predict neutrophil and lymphocyte specific cis-eQTLs and replicate these predictions in independent cell-type specific datasets. Finally, we show that SNPs associated with Crohn's disease preferentially affect gene expression within neutrophils, including the archetypal NOD2 locus.