Project description:Most of the millions of single-nucleotide polymorphisms (SNPs) in the human genome are non-coding, and many overlap with putative regulatory elements. Genome-wide association studies have linked many of these SNPs to human traits or to gene expression levels, but rarely with sufficient resolution to identify the causal SNPs. Functional screens based on reporter assays have previously lacked the throughput to test the vast space of SNPs for possible effects on enhancer and promoter activity. Here, we have leveraged the throughput of the SuRE reporter technology to survey a total of 5.9 million SNPs, including 57% of the known common SNPs worldwide. We identified more than 30 thousand SNPs that alter the activity of putative regulatory elements, often in a cell-type-specific manner. These data indicate that a large proportion of human non-coding SNPs may affect gene regulation. Integration of these SuRE data with genome-wide association studies may help to identify causal SNPs.
Project description:Background: Expression quantitative trait loci (eQTL) studies are a valuable approach for identifying genetic variants correlated with gene expression. However, identifying the causal variants is challenging due to linkage disequilibrium among variants in the same haplotype block. In this study, we aim to identify functional SNPs in key regulatory regions that alter transcriptional regulation and thus potentially impact cellular function. The majority of disease-associated single-nucleotide polymorphisms (SNPs) are located in regulatory regions, which can result in allele-specific binding (ASB) of transcription factors and differential expression of the target gene alleles. Here, we present regSNPs-ASB, a generalized linear model-based approach to accurately identify regulatory SNPs that are located in transcription factor binding sites from ATAC-seq data. Results: Using regSNPs-ASB, we identified 53 regulatory SNPs in human MCF-7 breast cancer cells and 125 regulatory SNPs in human mesenchymal stem cells (MSC). By integrating the regSNPs-ASB output with RNA-seq experimental data and publicly available chromatin interaction data from MCF-7 cells, we found that these 53 regulatory SNPs were associated with 74 potential target genes and that 32 (43%) of these genes showed significant allele-specific expression (ASE). By comparing all of the MCF-7 and MSC regulatory SNPs to the eQTLs in the Genotype-Tissue Expression (GTEx) Project database, we found that 30% (16/53) of the regulatory SNPs in MCF-7 and 43% (52/122) of the regulatory SNPs in MSC were also eQTLs. The enrichment of regulatory SNPs in eQTLs indicated that many of them are likely responsible for allelic differences in gene expression (chi-square test, p-value < 0.01). In sum, we conclude that regSNPs-ASB is a useful tool for identifying causal variants from ATAC-seq data.
This new computational tool will enable efficient prioritization of genetic variants identified as eQTLs for further studies to validate their causal regulatory function. Ultimately, identifying causal genetic variants will further our understanding of the underlying molecular mechanisms of disease and support the eventual development of potential therapeutic targets.
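To make the idea of allele-specific binding concrete, the following is a minimal sketch of an allelic-imbalance check at a single heterozygous SNP. It is not the regSNPs-ASB method, which the abstract describes as a generalized linear model; it swaps in a plain exact two-sided binomial test. The function name `allelic_imbalance_p`, the `p_ref` parameter, and the read counts are hypothetical and chosen only for illustration.

```python
from math import comb


def allelic_imbalance_p(ref_reads: int, alt_reads: int, p_ref: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one SNP.

    Illustrative stand-in only: NOT the regSNPs-ASB generalized linear model.
    Null hypothesis: each read covering the heterozygous site comes from the
    reference allele with probability p_ref (0.5 means no allelic bias).
    Intended for modest read depths, since probabilities are enumerated directly.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    # Probability of every possible reference-read count under the null.
    pmf = [comb(n, k) * p_ref**k * (1 - p_ref) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Two-sided p-value: total probability of outcomes no more likely than the
    # observed one (small tolerance guards against floating-point ties).
    p_value = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical read counts at a heterozygous site (not taken from the study).
balanced = allelic_imbalance_p(10, 10)
skewed = allelic_imbalance_p(20, 0)
print(f"balanced site p = {balanced:.3f}, skewed site p = {skewed:.2e}")
```

A site flagged this way would still need multiple-testing correction and a check for reference mapping bias before being treated as a candidate regulatory SNP.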
Project description:We adapted the DiR barcode-based parallel reporter assay strategy to systematically identify SNPs that affect gene expression by modulating the activity of regulatory elements. Among 293 SNPs linked with GWAS-identified prostate cancer risk SNPs, we found 32, 9, and 11 regulatory SNPs in 22Rv1, PC-3, and LNCaP cells, respectively. Further mechanistic study indicates that one SNP regulates gene expression in prostate cancer malignancy. The DiR system has great potential to advance the functional study of risk SNPs associated with polygenic diseases. Our findings hold great promise for benefiting prostate cancer patients through prognostic prediction.
Project description:The majority of single nucleotide polymorphisms (SNPs) associated with insulin resistance (IR)-relevant phenotypes by genome-wide association studies (GWASs) are located in noncoding regions, complicating their functional interpretation. Here, we utilized an adapted STARR-seq assay to systematically detect the regulatory activity of 5,987 noncoding GWAS SNPs in three IR-relevant cell lines. We identified 876 SNPs with biased allelic enhancer activity across 133 loci, and further uncovered the genetic regulatory mechanisms underlying functional SNPs by integrating multi-omics analyses.
Project description:Background & Aims: The contribution of genetics to the pathogenesis of inflammatory bowel disease (IBD) has been established by twin studies, targeted sequencing and genome-wide association studies (GWASs). These studies have yielded a plethora of risk loci, with the aim of identifying causal variants. Research on the genetic components of IBD has mainly focused on protein-coding genes, thereby omitting other functional elements in the human genome, i.e., the regulatory regions. Methods: Using acetylated histone 3 lysine 27 (H3K27ac) chromatin immunoprecipitation and sequencing (ChIP-seq), we identified tens of thousands of potential regulatory regions that are active in intestinal epithelium and immune cells, the main cell types involved in IBD. We correlated these regions with susceptibility loci for IBD. Results: We show that 45 out of 163 single nucleotide polymorphisms (SNPs) associated with IBD co-localize with active regulatory elements. In addition, another 47 IBD-associated SNPs co-localize with an active regulatory element via another SNP in strong linkage disequilibrium. Altogether, 92 out of 163 IBD-associated SNPs can be connected with a distinct regulatory element. This is 2.5 to 3.5 times more frequent than expected from random sampling. The genomic variation in these SNPs often creates or disrupts known binding motifs, thereby possibly affecting the binding affinity of transcriptional regulators and altering the expression of regulated genes. Conclusions: We show that, in addition to protein-coding genes, non-coding DNA regulatory regions, active in immune cells and in intestinal epithelium, are involved in IBD. H3K27ac ChIP-seq (ab4729, Abcam) profile of 7 intestinal epithelial samples.
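The overlap counts in the Results above can be checked with a few lines of arithmetic. The sketch below adds the two reported SNP groups and back-calculates what a 2.5- to 3.5-fold enrichment would imply for the number of overlaps expected by chance. It assumes the fold range applies to the combined count of 92 SNPs, which the abstract does not state explicitly; the helper names `overlap_summary` and `implied_expected` are hypothetical.

```python
def overlap_summary(direct: int, via_ld: int, total: int) -> dict:
    """Combine direct and LD-mediated overlaps and express them as a fraction."""
    connected = direct + via_ld
    return {"connected": connected, "fraction": connected / total}


def implied_expected(observed: int, fold: float) -> float:
    """Overlap count expected by chance if `observed` is `fold` times enriched."""
    return observed / fold


# Counts reported in the abstract: 45 direct overlaps, 47 via a SNP in strong LD,
# out of 163 IBD-associated SNPs.
summary = overlap_summary(direct=45, via_ld=47, total=163)

# ASSUMPTION: the 2.5x to 3.5x enrichment refers to the combined count.
expected_low = implied_expected(summary["connected"], 3.5)
expected_high = implied_expected(summary["connected"], 2.5)

print(f"connected SNPs: {summary['connected']} ({summary['fraction']:.1%} of 163)")
print(f"implied chance expectation: {expected_low:.1f} to {expected_high:.1f} SNPs")
```

Under that assumption, random sampling would be expected to connect roughly 26 to 37 of the 163 SNPs with a regulatory element, compared with the 92 observed.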