Project description: Genome-wide association studies (GWAS) have linked thousands of genetic variants with complex diseases and traits. However, most of the identified variants reside in non-coding regions of the genome, making it hard to predict and understand the molecular mechanisms underlying causal alleles. We have previously performed molecular quantitative trait locus (molQTL) analysis for transcription factor binding, chromatin accessibility, and H3K27 acetylation across primary human aortic endothelial cell (HAEC) samples. Here we expand this analysis to study more than 30,000 genetic variants using the massively parallel reporter assay (MPRA) STARR-seq in immortalized HAECs (teloHAEC). We demonstrate that more than 5,000 variants exhibit differential expression between alleles. We identify ETS and AP-1 motifs as the strongest sequence attributes, and cell type-specific chromatin accessibility as the strongest epigenetic attribute, associated with allele-specific regulatory activity. Using interleukin 1-beta (IL1b) as a modulator of cell state and environment, we observe robust evidence for context-specific SNP effects, underscoring the prevalence of genotype-by-environment (GxE) effects on non-coding variant function. We integrate results from molQTL mapping and STARR-seq with eQTL data from HAECs and GTEx tissues to fine-map functional non-coding SNPs at GWAS loci for vascular diseases.
Project description: Sequence variation in regulatory DNA alters gene expression and shapes genetically complex traits. However, the identification of individual, causal regulatory variants is challenging. Here, we used a massively parallel reporter assay to measure the cis-regulatory consequences of 5,832 natural DNA variants in the promoters of 2,503 genes in the yeast Saccharomyces cerevisiae. We identified 451 causal variants, which underlie genetic loci known to affect gene expression. Several promoters harbored multiple causal variants. In five promoters, pairs of variants showed non-additive, epistatic interactions. Causal variants were enriched at conserved nucleotides, tended to have low derived allele frequency, and were depleted from promoters of essential genes, consistent with the action of negative selection. Causal variants were enriched for alterations in transcription factor binding sites. Models integrating these features provided modest, but statistically significant, ability to predict causal variants. This work revealed a complex molecular basis for cis-acting regulatory variation.
Project description: Genetic variants in gene regulatory sequences can modify gene expression and mediate the molecular response to environmental stimuli. In addition, genotype–environment interactions (GxE) contribute to complex traits such as cardiovascular disease. Caffeine is the most widely consumed stimulant and is known to produce a vascular response. To investigate GxE for caffeine, we treated vascular endothelial cells with caffeine and used a massively parallel reporter assay to measure allelic effects on gene regulation for over 43,000 genetic variants. We identified 665 variants with allelic effects on gene regulation and 6 variants that regulate the gene expression response to caffeine (GxE, false discovery rate [FDR] < 5%). By overlapping our GxE results with expression quantitative trait loci colocalized with coronary artery disease and hypertension, we dissected their regulatory mechanisms and showed a modulatory role for caffeine. Our results demonstrate that the massively parallel reporter assay is a powerful approach to identify and molecularly characterize GxE in the specific context of caffeine consumption.
Project description: We performed a massively parallel screen in human HAP1 cells to identify loss-of-function missense variants in the key DNA mismatch repair factor MSH2. Resulting variant loss-of-function (LOF) scores are strongly concordant with previous functional evidence and available variant classification.
Project description: We report a novel high-throughput method to empirically quantify individual-specific regulatory element activity at the population scale. The approach combines targeted DNA capture with a high-throughput reporter-gene expression assay. As a demonstration, we have measured the activity of more than 100 putative regulatory elements from 95 individuals in a single experiment. We found that, in agreement with previous reports, most genetic variants have weak effects on distal regulatory element activity. Because haplotypes are typically maintained within but not between assayed regulatory elements, the approach can be used to identify likely causal regulatory haplotypes that contribute to human phenotypes. Finally, we demonstrate the utility of the method to functionally fine-map causal regulatory variants in regions of high linkage disequilibrium identified by expression quantitative trait loci (eQTL) analyses. 104 candidate regulatory elements from 95 individuals were resequenced using Illumina custom amplicon sequencing. We then cloned the resulting DNA fragments into a massively parallel reporter assay to quantify allele-specific regulatory activity from that population. SNP-fdr.txt contains the output of the significance evaluation; haplotype.fasta.gz contains the reference used to generate the alignment files.