Transcriptomics

Dataset Information

Deciphering Regulatory SNPs from ATAC-seq


ABSTRACT: Background: Expression quantitative trait loci (eQTL) studies are a valuable approach for identifying genetic variants correlated with gene expression. However, identifying the causal variants is challenging because of linkage disequilibrium among variants in the same haplotype block. In this study, we aim to identify functional SNPs in key regulatory regions that alter transcriptional regulation and thus potentially impact cellular function. The majority of disease-associated single-nucleotide polymorphisms (SNPs) are located in regulatory regions, where they can cause allele-specific binding (ASB) of transcription factors and differential expression of the target gene alleles. Here, we present regSNPs-ASB, a generalized linear model-based approach to accurately identify regulatory SNPs that are located in transcription factor binding sites from ATAC-seq data. Results: Using regSNPs-ASB, we identified 53 regulatory SNPs in human MCF-7 breast cancer cells and 125 regulatory SNPs in human mesenchymal stem cells (MSC). By integrating the regSNPs-ASB output with RNA-seq experimental data and publicly available chromatin interaction data from MCF-7 cells, we found that these 53 regulatory SNPs were associated with 74 potential target genes and that 32 (43%) of these genes showed significant allele-specific expression (ASE). By comparing all of the MCF-7 and MSC regulatory SNPs to the eQTLs in the Genotype-Tissue Expression (GTEx) Project database, we found that 30% (16/53) of the regulatory SNPs in MCF-7 and 43% (52/122) of the regulatory SNPs in MSC were also eQTLs. The enrichment of regulatory SNPs in eQTLs indicated that many of them are likely responsible for allelic differences in gene expression (chi-square test, p-value < 0.01). In sum, we conclude that regSNPs-ASB is a useful tool for identifying causal variants from ATAC-seq data.
This new computational tool will enable efficient prioritization of genetic variants identified as eQTLs for further studies to validate their causal regulatory function. Ultimately, identifying causal genetic variants will further our understanding of the molecular mechanisms underlying disease and support the eventual development of potential therapeutic targets.
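
As a rough illustration of what allele-specific signal at a heterozygous SNP means, the sketch below runs an exact two-sided binomial test on reference and alternate read counts from ATAC-seq. This is not the regSNPs-ASB method, which the abstract describes as a generalized linear model; the function name, the 0.5 null proportion, and the read counts are all assumptions made up for the example.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_ref: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one SNP.

    Hypothetical helper for illustration only; it is NOT the GLM used by
    regSNPs-ASB. The null hypothesis is that each read comes from the
    reference allele with probability p_ref (0.5 assumes no mapping bias).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no coverage, so no evidence of imbalance
    # Probability of every possible reference-read count under the null
    pmf = [comb(n, k) * p_ref**k * (1 - p_ref) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum all outcomes that are no more likely than the observed one
    # (small relative tolerance guards against floating-point ties)
    total = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Made-up read counts: (reference allele reads, alternate allele reads)
    for ref, alt in [(10, 10), (18, 2), (20, 0)]:
        print(ref, alt, binomial_two_sided_p(ref, alt))
```

A real analysis would also need to handle reference mapping bias, overdispersion between replicates, and multiple-testing correction across SNPs, which is part of the motivation for a model-based approach.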

ORGANISM(S): Homo sapiens

PROVIDER: GSE145245 | GEO | 2020/02/14

REPOSITORIES: GEO

Similar Datasets

2016-01-21 | E-GEOD-77052 | biostudies-arrayexpress
2019-11-28 | GSE140553 | GEO
2020-03-27 | GSE147628 | GEO
2014-06-23 | E-GEOD-53351 | biostudies-arrayexpress
2020-06-02 | GSE139377 | GEO
2018-04-28 | E-MTAB-6666 | biostudies-arrayexpress
2014-06-23 | GSE53351 | GEO
2016-02-09 | E-GEOD-77688 | biostudies-arrayexpress
2011-06-27 | E-GEOD-28893 | biostudies-arrayexpress
2018-04-28 | E-MTAB-6667 | biostudies-arrayexpress