Genome-wide co-expression analysis in multiple tissues.
ABSTRACT: Expression quantitative trait loci (eQTLs) represent genetic control points of gene expression, and can be categorized as cis- and trans-acting, reflecting local and distant regulation of gene expression respectively. Although there is evidence of co-regulation within clusters of trans-eQTLs, the extent of co-expression patterns and their relationship with the genotypes at eQTLs are not fully understood. We have mapped thousands of cis- and trans-eQTLs in four tissues (fat, kidney, adrenal and left ventricle) in a large panel of rat recombinant inbred (RI) strains. Here we investigate the genome-wide correlation structure in expression levels of eQTL transcripts and underlying genotypes to elucidate the nature of co-regulation within cis- and trans-eQTL datasets. Across the four tissues, we consistently found statistically significant correlations of cis-regulated gene expression to be rare (<0.9% of all pairs tested). Most (>80%) of the observed significant correlations of cis-regulated gene expression are explained by correlation of the underlying genotypes. In comparison, co-expression of trans-regulated gene expression is more common, with significant correlation ranging from 2.9%-14.9% of all pairs of trans-eQTL transcripts. We observed a total of 81 trans-eQTL clusters (hot-spots), defined as consisting of ≥10 eQTLs linked to a common region, with very high levels of correlation between trans-regulated transcripts (77.2-90.2%). Moreover, functional analysis of large trans-eQTL clusters (≥30 eQTLs) revealed significant functional enrichment among genes comprising 80% of the large clusters. The results of this genome-wide co-expression study show the effects of the eQTL genotypes on the observed patterns of correlation, and suggest that functional relatedness between genes underlying trans-eQTLs is reflected in the degree of co-expression observed in trans-eQTL clusters.
Our results demonstrate the power of an integrative, systematic approach to the analysis of a large gene expression dataset to uncover underlying structure, and inform future eQTL studies.
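The hot-spot definition used in the abstract above (at least 10 trans-eQTLs linked to a common region) can be illustrated with a minimal sketch. The abstract does not say how a "common region" was delimited, so the fixed-width genomic bins, the `bin_size` value, the function name `find_hotspots` and the toy coordinates below are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter

def find_hotspots(eqtl_positions, bin_size, min_eqtls=10):
    """Count eQTLs per fixed-width genomic bin and keep bins with >= min_eqtls.

    eqtl_positions: iterable of (chromosome, base_pair_position) tuples.
    bin_size: hypothetical bin width in base pairs (not given in the abstract).
    """
    counts = Counter((chrom, pos // bin_size) for chrom, pos in eqtl_positions)
    return {key: n for key, n in counts.items() if n >= min_eqtls}

# Toy data: 12 eQTLs packed into one region of chr1, 2 isolated ones on chr2.
eqtls = [("chr1", 1_000_000 + i * 1_000) for i in range(12)]
eqtls += [("chr2", 5_000_000), ("chr2", 9_000_000)]

hotspots = find_hotspots(eqtls, bin_size=2_000_000)
print(hotspots)
```

Raising `min_eqtls` to 30 would select only the "large" clusters that the abstract subjects to functional enrichment analysis.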
Project description:Expression quantitative trait locus (eQTL) analysis, which links variations in gene expression to genotypes, is essential to understanding gene regulation and to interpreting disease-associated loci. Currently identified eQTLs are mainly in samples of blood and other normal tissues. However, no database comprehensively provides eQTLs in a large number of cancer samples. Using the genotype and expression data of 9196 tumor samples in 33 cancer types from The Cancer Genome Atlas (TCGA), we identified 5 606 570 eQTL-gene pairs in the cis-eQTL analysis and 231 210 eQTL-gene pairs in the trans-eQTL analysis. We further performed survival analysis and identified 22 212 eQTLs associated with patient overall survival. Furthermore, we linked the eQTLs to genome-wide association studies (GWAS) data and identified 337 131 eQTLs that overlap with existing GWAS loci. We developed PancanQTL, a user-friendly database (http://bioinfo.life.hust.edu.cn/PancanQTL/), to store cis-eQTLs, trans-eQTLs, survival-associated eQTLs and GWAS-related eQTLs to enable searching, browsing and downloading. PancanQTL could help the research community understand the effects of inherited variants in tumorigenesis and development.
Project description:Genetics of gene expression (eQTLs or expression QTLs) has proved an indispensable tool for understanding biological pathways and pathomechanisms of trait-associated SNPs. However, power of most genome-wide eQTL studies is still limited. We performed a large eQTL study in peripheral blood mononuclear cells of 2112 individuals increasing the power to detect trans-effects genome-wide. Going beyond univariate SNP-transcript associations, we analyse relations of eQTLs to biological pathways, polygenetic effects of expression regulation, trans-clusters and enrichment of co-localized functional elements. We found eQTLs for about 85% of analysed genes, and 18% of genes were trans-regulated. Local eSNPs were enriched up to a distance of 5 Mb to the transcript challenging typically implemented ranges of cis-regulations. Pathway enrichment within regulated genes of GWAS-related eSNPs supported functional relevance of identified eQTLs. We demonstrate that nearest genes of GWAS-SNPs might frequently be misleading functional candidates. We identified novel trans-clusters of potential functional relevance for GWAS-SNPs of several phenotypes including obesity-related traits, HDL-cholesterol levels and haematological phenotypes. We used chromatin immunoprecipitation data for demonstrating biological effects. Yet, we show for strongly heritable transcripts that still little trans-chromosomal heritability is explained by all identified trans-eSNPs; however, our data suggest that most cis-heritability of these transcripts seems explained. Dissection of co-localized functional elements indicated a prominent role of SNPs in loci of pseudogenes and non-coding RNAs for the regulation of coding genes. In summary, our study substantially increases the catalogue of human eQTLs and improves our understanding of the complex genetic regulation of gene expression, pathways and disease-related processes.
Project description:A large fraction of human genes are regulated by genetic variation near the transcribed sequence (cis-eQTL, expression quantitative trait locus), and many cis-eQTLs have implications for human disease. Less is known regarding the effects of genetic variation on expression of distant genes (trans-eQTLs) and their biological mechanisms. In this work, we use genome-wide data on SNPs and array-based expression measures from mononuclear cells obtained from a population-based cohort of 1,799 Bangladeshi individuals to characterize cis- and trans-eQTLs and determine if observed trans-eQTL associations are mediated by expression of transcripts in cis with the SNPs showing trans-association, using Sobel tests of mediation. We observed 434 independent trans-eQTL associations at a false-discovery rate of 0.05, and 189 of these trans-eQTLs were also cis-eQTLs (enrichment P<0.0001). Among these 189 trans-eQTL associations, 39 were significantly attenuated after adjusting for a cis-mediator based on Sobel P<10^-5. We attempted to replicate 21 of these mediation signals in two European cohorts, and while only 7 trans-eQTL associations were present in one or both cohorts, 6 showed evidence of cis-mediation. Analyses of simulated data show that complete mediation will be observed as partial mediation in the presence of mediator measurement error or imperfect LD between measured and causal variants. Our data demonstrate that trans-associations can become significantly stronger or switch directions after adjusting for a potential mediator. Using simulated data, we demonstrate that this phenomenon is expected in the presence of strong cis-trans confounding and when the measured cis-transcript is correlated with the true (unmeasured) mediator. In conclusion, by applying mediation analysis to eQTL data, we show that a substantial fraction of observed trans-eQTL associations can be explained by cis-mediation.
Future studies should focus on understanding the mechanisms underlying widespread cis-mediation and their relevance to disease biology, as well as using mediation analysis to improve eQTL discovery.
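The Sobel test of mediation named in the abstract above can be sketched in its standard first-order form. Here `a` is the estimated effect of the SNP on the cis-transcript (the candidate mediator), `b` is the effect of the cis-transcript on the trans-transcript after adjusting for the SNP, and `se_a`, `se_b` are their standard errors. The function name and the numerical inputs are hypothetical; the abstract does not report the regression models or estimates, so this is an illustration of the statistic rather than the study's pipeline.

```python
import math

def sobel_test(a, se_a, b, se_b):
    """Return the first-order Sobel z statistic and its two-sided normal p-value."""
    # Standard error of the indirect effect a*b (first-order delta method).
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = (a * b) / se_ab
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical path estimates, chosen only to exercise the function.
z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
# The abstract's reported significance threshold for mediation.
significant = p < 1e-5
print(f"z = {z:.3f}, p = {p:.2e}, passes P<10^-5: {significant}")
```

With these made-up inputs the indirect effect is nominally significant but does not reach the much stricter P<10^-5 threshold quoted in the abstract.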
Project description:BACKGROUND: We aimed to assess whether whole blood expression quantitative trait loci (eQTLs) with effects in cis and trans are robust and can be used to identify regulatory pathways affecting disease susceptibility. MATERIALS AND METHODS: We performed whole-genome eQTL analyses in 890 participants of the KORA F4 study and in two independent replication samples (SHIP-TREND, N = 976 and EGCUT, N = 842) using linear regression models and Bonferroni correction. RESULTS: In the KORA F4 study, 4,116 cis-eQTLs (defined as SNP-probe pairs where the SNP is located within a 500 kb window around the transcription unit) and 94 trans-eQTLs reached genome-wide significance and overall 91% (92% of cis-, 84% of trans-eQTLs) were confirmed in at least one of the two replication studies. Different study designs including distinct laboratory reagents (PAXgene™ vs. Tempus™ tubes) did not affect reproducibility (separate overall replication overlap: 78% and 82%). Immune response pathways were enriched in cis- and trans-eQTLs and significant cis-eQTLs were partly coexistent in other tissues (cross-tissue similarity 40-70%). Furthermore, four chromosomal regions displayed simultaneous impact on multiple gene expression levels in trans, and 746 eQTL-SNPs have been previously reported to have clinical relevance. We demonstrated cross-associations between eQTL-SNPs, gene expression levels in trans, and clinical phenotypes as well as a link between eQTLs and human metabolic traits via modification of gene regulation in cis. CONCLUSIONS: Our data suggest that whole blood is a robust tissue for eQTL analysis and may be used both for biomarker studies and to enhance our understanding of molecular mechanisms underlying gene-disease associations.
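The distance-based cis definition in the abstract above (SNP within a 500 kb window around the transcription unit) can be expressed as a small classifier. The abstract does not say whether 500 kb is measured on each side of the transcript or in total, so the sketch assumes a flank of 500 kb on each side and exposes it as a parameter. The function name `classify_eqtl` and the example coordinates are hypothetical.

```python
def classify_eqtl(snp_chrom, snp_pos, gene_chrom, tx_start, tx_end, flank=500_000):
    """Label a SNP-probe pair 'cis' or 'trans' by genomic distance.

    Assumption: 'cis' means same chromosome and within `flank` base pairs
    of the transcription unit on either side; everything else is 'trans'.
    """
    same_chrom = snp_chrom == gene_chrom
    in_window = tx_start - flank <= snp_pos <= tx_end + flank
    return "cis" if same_chrom and in_window else "trans"

# Hypothetical transcript spanning 2.00-2.05 Mb on chromosome 1.
near = classify_eqtl("1", 1_600_000, "1", 2_000_000, 2_050_000)
far = classify_eqtl("1", 1_400_000, "1", 2_000_000, 2_050_000)
other = classify_eqtl("7", 2_010_000, "1", 2_000_000, 2_050_000)
print(near, far, other)
```

Because the label depends entirely on the chosen distance, the same SNP-probe pair can switch between cis and trans when `flank` changes.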
Project description:Understanding the causal processes that contribute to disease onset and progression is essential for developing novel therapies. Although trans-acting expression quantitative trait loci (trans-eQTLs) can directly reveal cellular processes modulated by disease variants, detecting trans-eQTLs remains challenging due to their small effect sizes. Here, we analysed gene expression and genotype data from six blood cell types from 226 to 710 individuals. We used co-expression modules inferred from gene expression data with five methods as traits in trans-eQTL analysis to limit multiple testing and improve interpretability. In addition to replicating three established associations, we discovered a novel trans-eQTL near SLC39A8 regulating a module of metallothionein genes in LPS-stimulated monocytes. Interestingly, this effect was mediated by a transient cis-eQTL present only in early LPS response and lost before the trans effect appeared. Our analyses highlight how co-expression combined with functional enrichment analysis improves the identification and prioritisation of trans-eQTLs when applied to emerging cell-type-specific datasets.
Project description:The simplest definition of cis-eQTLs, as opposed to trans-eQTLs, refers to genetic variants that affect expression in an allele-specific manner, with implications for the underlying mechanism. Yet, due to technical limitations of expression microarrays, the vast majority of eQTL studies performed in the last decade used a genomic distance-based definition as a surrogate for cis, therefore exploring local rather than cis-eQTLs. In this study we use RNAseq to explore allele-specific expression (ASE) in adipose tissue of male and female F1 mice, produced from reciprocal crosses of C57BL/6J and DBA/2J strains. Comparison of the identified cis-eQTLs to local-eQTLs that were obtained from adipose tissue expression in two previous population-based studies in our laboratory yields poor overlap between the two mapping approaches, while both local-eQTL studies show highly concordant results. Specifically, local-eQTL studies show ~60% overlap between themselves, while only 15-20% of local-eQTLs are identified as cis by ASE, and less than 50% of ASE genes are recovered in local-eQTL studies. Utilizing recently published ENCODE data, we also find that ASE genes show a significant bias for SNP prevalence in DNase I hypersensitive sites that is ASE direction specific. We suggest a new approach to analysis of allele-specific expression that is more sensitive and accurate than the commonly used Fisher or chi-square statistics. Our analysis indicates that technical differences between the cis and local-eQTL approaches, such as differences in genomic background or sex specificity, account for a relatively small fraction of the discrepancy. Therefore, we suggest that the differences between the two eQTL mapping approaches may facilitate sorting of SNP-eQTL interactions into true cis and trans, and that a considerable portion of local-eQTLs may actually represent trans interactions.
Project description:Understanding the complexity of the human brain transcriptome architecture is one of the most important human genetics study areas. Previous studies have applied expression quantitative trait loci (eQTL) analysis at the genome-wide level of the brain to understand the underlying mechanisms relating to neurodegenerative diseases, primarily at the transcript level. To increase the resolution of our understanding, the current study investigates multi/single-region, transcript/exon-level and cis versus trans-acting eQTL, across 10 regions of the human brain. Some of the key findings of this study are: (i) only a relatively small proportion of eQTLs will be detected, where the sensitivity is under 5%; (ii) when an eQTL is acting in multiple regions (MR-eQTL), it tends to have very similar effects on gene expression in each of these regions, as well as being cis-acting; (iii) trans-acting eQTLs tend to have larger effects on expression compared to cis-acting eQTLs and tend to be specific to a single region (SR-eQTL) of the brain; (iv) the cerebellum has a very large number of eQTLs that function exclusively in this region, compared with other regions of the brain; (v) importantly, an interactive visualisation tool (Shiny app) was developed to visualise the MR/SR-eQTL at transcript and exon levels.
Project description:BACKGROUND: Identification of single nucleotide polymorphisms (SNPs) associated with gene expression levels, known as expression quantitative trait loci (eQTLs), may improve understanding of the functional role of phenotype-associated SNPs in genome-wide association studies (GWAS). The small sample sizes of some previous eQTL studies have limited their statistical power. We conducted an eQTL investigation of microarray-based gene and exon expression levels in whole blood in a cohort of 5257 individuals, exceeding the single cohort size of previous studies by more than a factor of 2. RESULTS: We detected over 19,000 independent lead cis-eQTLs and over 6000 independent lead trans-eQTLs, targeting over 10,000 gene targets (eGenes), with a false discovery rate (FDR) < 5%. Of previously published significant GWAS SNPs, 48% are identified to be significant eQTLs in our study. Some trans-eQTLs point toward novel mechanistic explanations for the association of the SNP with the GWAS-related phenotype. We also identify 59 distinct blocks or clusters of trans-eQTLs, each targeting the expression of sets of six to 229 distinct trans-eGenes. Ten of these sets of target genes are significantly enriched for microRNA targets (FDR < 5%). Many of these clusters are associated in GWAS with multiple phenotypes. CONCLUSIONS: These findings provide insights into the molecular regulatory patterns involved in human physiology and pathophysiology. We illustrate the value of our eQTL database in the context of a recent GWAS meta-analysis of coronary artery disease and provide a list of targeted eGenes for 21 of 58 GWAS loci.
Project description:The spontaneously hypertensive rat (SHR) is a widely used rodent model of hypertension and metabolic syndrome. Previously we identified thousands of cis-regulated expression quantitative trait loci (eQTLs) across multiple tissues using a panel of rat recombinant inbred (RI) strains derived from Brown Norway and SHR progenitors. These cis-eQTLs represent potential susceptibility loci underlying physiological and pathophysiological traits manifested in SHR. We have prioritized 60 cis-eQTLs and confirmed differential expression between the parental strains by quantitative PCR in 43 (72%) of the eQTL transcripts. Quantitative trait transcript (QTT) analysis in the RI strains showed highly significant correlation between cis-eQTL transcript abundance and clinically relevant traits such as systolic blood pressure and blood glucose, with the physical location of a subset of the cis-eQTLs colocalizing with "physiological" QTLs (pQTLs) for these same traits. These colocalizing correlated cis-eQTLs (c3-eQTLs) are highly attractive as primary susceptibility loci for the colocalizing pQTLs. Furthermore, sequence analysis of the c3-eQTL genes identified single nucleotide polymorphisms (SNPs) that are predicted to affect transcription factor binding affinity, splicing and protein function. These SNPs, which potentially alter transcript abundance and stability, represent strong candidate factors underlying not just eQTL expression phenotypes, but also the correlated metabolic and physiological traits. In conclusion, by integration of genomic sequence, eQTL and QTT datasets we have identified several genes that are strong positional candidates for pathophysiological traits observed in the SHR strain. These findings provide a basis for the functional testing and ultimate elucidation of the molecular basis of these metabolic and cardiovascular phenotypes.
Project description:The genetics of gene expression in recombinant inbred lines (RILs) can be mapped as expression quantitative trait loci (eQTLs). So-called "genetical genomics" studies have identified locally acting eQTLs (cis-eQTLs) for genes that show differences in steady-state RNA levels. These studies have also identified distantly acting master-modulatory trans-eQTLs that regulate tens or hundreds of transcripts (hotspots or transbands). We expand on these studies by performing genetical genomics experiments in two environments in order to identify trans-eQTLs that might be regulated by developmental exposure to the neurotoxin lead. Flies from each of 75 RILs were raised from eggs to adults on either control food (made with 250 microM sodium acetate), or lead-treated food (made with 250 microM lead acetate, PbAc). RNA expression analyses of whole adult male flies (5-10 days old) were performed with Affymetrix DrosII whole genome arrays (18,952 probesets). Among the 1389 genes with cis-eQTLs, there were 405 genes unique to control flies and 544 genes unique to lead-treated ones (440 genes had the same cis-eQTLs in both samples). There are 2396 genes with trans-eQTLs which mapped to 12 major transbands with greater than 95 genes. Permutation analyses of the strain labels but not the expression data suggest that the total number of eQTLs and the number of transbands are more important criteria for validation than the size of the transband. Two transbands, one located on the 2nd chromosome and one on the 3rd chromosome, co-regulate 33 lead-induced genes, many of which are involved in neurodevelopmental processes. For these 33 genes, rather than allelic variation at one locus exerting differential effects in two environments, we found that variation at two different loci is required for optimal effects on lead-induced expression.