Project description:We adapted the self-transcribing active regulatory region sequencing (STARR-seq) strategy to systematically identify SNPs that affect gene expression by modulating the activity of regulatory elements. Among 10,673 SNPs linked with 996 GWAS-identified cancer-risk SNPs, we found 70 regulatory variants for which the two alleles conferred different regulatory activities. We analyzed one of them in depth and confirmed its target using CRISPR-Cas9 technology. Our results will aid the interpretation of GWAS results and improve cancer risk assessment.
Project description:Genome-wide association studies (GWAS) have successfully identified 145 genomic regions that contribute to schizophrenia risk, but linkage disequilibrium (LD) makes it challenging to discern causal variants. Computational finemapping prioritized thousands of credible variants, ~98% of which lie within poorly characterized non-coding regions. To functionally validate their regulatory effects, we performed a massively parallel reporter assay (MPRA) on 5,173 finemapped schizophrenia GWAS variants in primary human neural progenitors (HNPs). We identified 440 variants with allelic regulatory effects (MPRA-positive variants), with 72% of GWAS loci containing at least one MPRA-positive variant. Transcription factor binding had modest power to predict the allelic activity of MPRA-positive variants, while GWAS association, finemap posterior probability, enhancer overlap, and evolutionary conservation failed to predict MPRA-positive variants. Furthermore, 64% of MPRA-positive variants did not exhibit an eQTL signature, suggesting that MPRA could identify as-yet unexplored variants with regulatory potential. MPRA-positive variants differed from eQTLs, as they were more frequently located in distal neuronal enhancers. Therefore, we leveraged neuronal 3D chromatin architecture to identify 273 genes that physically interact with MPRA-positive variants. These genes, annotated by the chromatin interactome, displayed higher mutational constraint and regulatory complexity than genes annotated by eQTLs, recapitulating a recent finding that eQTL- and GWAS-detected variants map to genes with different properties. Finally, we propose a model in which the allelic activity of multiple variants within a GWAS locus can be aggregated to predict gene expression by taking chromatin contact frequency and accessibility into account.
In conclusion, we demonstrate that MPRA can effectively identify functional regulatory variants and delineate previously unknown regulatory principles of schizophrenia.
Project description:Eosinophilic esophagitis (EoE) is a rare atopic disorder associated with esophageal dysfunction, including difficulty swallowing, food impaction, and inflammation. EoE develops in a small subset of people with food allergies under the influence of environmental and genetic risk factors. Genome-wide association studies (GWAS) have identified 31 independent risk loci for the disease, and linkage disequilibrium (LD) expansion of these loci nominates a set of 531 variants that are potentially causal. These risk variants are non-coding, suggesting a likely role in altered gene regulatory mechanisms. To systematically interrogate the gene regulatory activity of these variants, we designed a massively parallel reporter assay (MPRA) containing the alleles of each variant within their 170 bp genomic sequence context, cloned into a GFP reporter library. Transfection of the MPRA library into TE-7 esophageal epithelial cells, HaCaT skin keratinocytes, and Jurkat T cells revealed cell-type-specific gene regulation. We identify 32 allelic enhancer variants (allelic enVars) that regulate reporter gene expression in a genotype-dependent manner in at least one cellular context. By annotating these variants with expression quantitative trait loci (eQTL) and chromatin looping data in related tissues and cell types, we identify putative target genes affected by genetic variation in EoE patients, including TSLP and multiple genes at the HLA locus. Transcription factor enrichment analyses reveal possible roles for cell-type-specific regulators, including GATA-3, a key regulator of type 2 inflammation. Collectively, our approach reduces the large set of EoE-associated variants to a set of 32 with allelic regulatory activity, providing new functional insights into the effects of genetic variation in this disease.
Project description:To date, genome-wide association studies (GWAS) have revealed over 200 genetic risk loci associated with prostate cancer; yet, true disease-causing variants in gene regulatory regions remain elusive. Identification of causal variants and their targets from association signals relevant to prostate cancer is complicated by high linkage disequilibrium and limited availability of functional genomics data for specific tissue/cell types. Here, we integrated statistical fine-mapping and functional annotation from prostate-specific epigenomic profiles, high-resolution 3D genome features, and quantitative trait loci data to distinguish causal variants from associations and identify the target genes they regulate. Our fine-mapping analysis yielded 1,892 likely causal variants, and multiscale functional annotation linked them to 406 target genes. We prioritized rs10486567, located in an enhancer, as a genome-wide top-ranked SNP and predicted HOTTIP as its target. Deletion of the rs10486567-associated enhancer in prostate cancer cells decreased their capacity for invasive migration. HOTTIP overexpression in an enhancer-KO cell line rescued defective invasive migration. Furthermore, we found that rs10486567 regulates HOTTIP through allele-specific long-range chromatin interaction.
Project description:Application of systems genetics analysis to systematically evaluate candidate causal genes associated with risk of type 1 diabetes, with follow-up bioinformatics pathway analysis. Total RNA was obtained from whole blood (for EBV-B cells) and peripheral blood mononuclear cells (PBMCs) (for T cells).