Project description:Osteoporosis is a skeletal disorder characterized by low bone mineral density (BMD) and deterioration of bone microarchitecture. An effective strategy for identifying novel genetic loci underlying osteoporosis is to focus on variants with high potential functional impact. Enhancers play a crucial role in regulating cell-type-specific transcription. Therefore, single-nucleotide polymorphisms (SNPs) located in enhancers (enhancer-SNPs) may represent strong candidate functional variants. Here, we performed a targeted analysis for potential functional enhancer-SNPs that may affect gene expression and biological processes in bone-related cells, specifically osteoblasts and peripheral blood monocytes (PBMs), using five independent cohorts (n = 5905) and summary statistics from the Genetic Factors for Osteoporosis (GEFOS) consortium, followed by comprehensive integrative genomic analyses of chromatin states, transcription, and metabolites. We identified 15 novel enhancer-SNPs associated with femoral neck and lumbar spine BMD, including 5 SNPs mapped to novel genes (e.g., rs10840343 and rs10770081 in the IGF2 gene) and 10 novel SNPs mapped to known BMD-associated genes (e.g., rs2941742 in the ESR1 gene, and rs10249092 and rs4342522 in the SHFM1 gene). Interestingly, enhancer-SNPs rs10249092 and rs4342522 in SHFM1 were tightly linked but annotated to different enhancers in PBMs and osteoblasts, respectively, suggesting that even tightly linked SNPs may regulate the same target gene and contribute to phenotypic variation in a cell-type-specific manner. Importantly, ten enhancer-SNPs may also regulate BMD variation by affecting serum metabolite levels. Our findings revealed novel susceptibility loci that may regulate BMD variation and provided intriguing insights into the genetic mechanisms of osteoporosis.
Project description:Enhancers possess both structural elements mediating promoter looping and functional elements mediating gene expression. Traditional models of enhancer-mediated gene regulation imply genomic overlap or immediate adjacency of these elements. We test this model by combining densely-tiled CRISPRa screening with nucleosome-resolution Region Capture Micro-C topology analysis. Using this integrated approach, we comprehensively define the cis-regulatory landscape for the tumor suppressor PTEN, identifying and validating 10 distinct enhancers and defining their 3D spatial organization. Unexpectedly, we identify several long-range functional enhancers whose promoter proximity is facilitated by chromatin loop anchors several kilobases away, and demonstrate that accounting for this spatial separation improves the computational prediction of validated enhancers. Thus, we propose a new model of enhancer organization incorporating spatial separation of essential functional and structural components.
Project description:Multicellular organismal development is controlled by a complex network of transcription factors, promoters and enhancers. Although reliable computational and experimental methods exist for enhancer detection, prediction of their target genes remains a major challenge. On the basis of available literature and ChIP-seq and ChIP-chip data for enhanceosome factor p300 and the transcriptional regulator Gli3, we found that genomic proximity and conserved synteny predict target genes with a relatively low recall of 12-27% within 2 Mb intervals centered at the enhancers. Here, we show that functional similarities between enhancer binding proteins and their transcriptional targets and proximity in the protein-protein interactome improve prediction of target genes. We used all four features to train random forest classifiers that predict target genes with a recall of 58% in 2 Mb intervals that may contain dozens of genes, representing a better than two-fold improvement over the performance of prediction based on single features alone. Genome-wide ChIP data is still relatively poorly understood, and it remains difficult to assign biological significance to binding events. Our study represents a first step in integrating various genomic features in order to elucidate the genomic network of long-range regulatory interactions.
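The proximity baseline and the recall metric described in the abstract above can be illustrated with a minimal sketch. It assumes a single chromosome, one transcription start site (TSS) per gene, and a nearest-gene rule. The gene names, coordinates, and known enhancer-target pairs are hypothetical toy values, not data from the study, and the study's random forest classifiers are not reproduced here.

```python
from typing import Dict, List, Optional, Set, Tuple

WINDOW = 2_000_000  # 2 Mb interval centered at the enhancer, as in the text


def candidate_genes(enhancer_pos: int, genes: Dict[str, int]) -> List[str]:
    """Return genes whose TSS lies within the 2 Mb interval around the enhancer."""
    half = WINDOW // 2
    return [name for name, tss in genes.items() if abs(tss - enhancer_pos) <= half]


def nearest_gene(enhancer_pos: int, genes: Dict[str, int]) -> Optional[str]:
    """Proximity rule (an assumption of this sketch): pick the closest candidate."""
    candidates = candidate_genes(enhancer_pos, genes)
    if not candidates:
        return None
    return min(candidates, key=lambda g: abs(genes[g] - enhancer_pos))


def recall(predicted: Set[Tuple[str, str]], known: Set[Tuple[str, str]]) -> float:
    """Fraction of known enhancer-target pairs that were recovered."""
    return len(predicted & known) / len(known) if known else 0.0


# Hypothetical toy data: TSS coordinates and enhancer midpoints on one chromosome.
genes = {"geneA": 1_000_000, "geneB": 1_400_000, "geneC": 3_500_000}
enhancers = {"enh1": 1_100_000, "enh2": 1_300_000}
known_pairs = {("enh1", "geneA"), ("enh2", "geneA")}

predicted_pairs = set()
for enh, pos in enhancers.items():
    target = nearest_gene(pos, genes)
    if target is not None:
        predicted_pairs.add((enh, target))

proximity_recall = recall(predicted_pairs, known_pairs)
print(f"recall of the proximity rule: {proximity_recall:.2f}")
```

In this toy case the second enhancer skips over its nearest gene, so the proximity rule recovers only one of the two known pairs. That is the kind of miss that motivates adding further features.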
Project description:Genome-wide association studies (GWAS) are identifying genetic predisposition to various diseases. The 17q24.3 locus harbors the single nucleotide polymorphism (SNP) rs1859962 that is statistically associated with prostate cancer (PCa). It defines a 130-kb linkage disequilibrium (LD) block that lies in a ~2-Mb gene desert area. The functional biology driving the risk associated with this LD block is unknown. Here, we integrate genome-wide chromatin landscape data sets, namely, epigenomes and chromatin openness from diverse cell types. This identifies a PCa-specific enhancer within the rs1859962 risk LD block that establishes a 1-Mb chromatin loop with the SOX9 gene. The rs8072254 and rs1859961 SNPs mapping to this enhancer impose allele-specific gene expression. The variant allele of rs8072254 facilitates androgen receptor (AR) binding driving increased enhancer activity. The variant allele of rs1859961 decreases FOXA1 binding while increasing AP-1 binding. The latter is key to imposing allele-specific gene expression. The rs8072254 variant in strong LD with the rs1859962 risk SNP can account for the risk associated with this locus, while rs1859961 is a rare variant less likely to contribute to the risk associated with this LD block. Together, our results demonstrate that multiple genetic variants mapping to a unique enhancer looping to the SOX9 oncogene can account for the risk associated with the PCa 17q24.3 locus. Allele-specific recruitment of the transcription factors androgen receptor (AR) and activating protein-1 (AP-1) account for the increased enhancer activity ascribed to this PCa-risk LD block. This further supports the notion that an integrative genomics approach can identify the functional biology disrupted by genetic risk variants.
Project description:Background: The increasing availability of multiple types of genomic profiles measured from the same cancer patients has provided numerous opportunities for investigating the genomic mechanisms underlying cancer. In particular, association studies of gene expression traits with respect to multi-layered genomic features are highly useful for uncovering these mechanisms. Conventional correlation-based association tests are limited because they are prone to revealing indirect associations. Moreover, integrating multiple types of genomic features raises another challenge. Methods: In this study, we propose a new framework for association studies, called the integrative regression network, that identifies genomic associations on multiple high-dimensional genomic profiles by taking into account the associations between as well as within profiles. We employed high-dimensional regression techniques to first identify the associations between different genomic profiles. Based on the resulting regression coefficients, a regression network was constructed within each profile. For example, two methylation features having similar regression coefficients with respect to a number of gene expression traits are likely to be involved in the same biological process, and we therefore define an edge between the two methylation features in the regression network. To extract more reliable associations, multiple sparse structured regression techniques were applied and the resulting networks were merged into the integrative regression network using a similarity network fusion technique. Results: Experiments were carried out using four different sparse structured regression methods on five cancer types from TCGA. The advantages and disadvantages of each regression method were also explored. We found large inconsistency in the results from different regression methods, which supports the need to extract the proposed integrative regression network from multiple complementary regression techniques. Fusing multiple regression networks by using similarity measurements led to the identification of significant gene pairs and a resulting network with better topological properties. Conclusions: We developed and validated the integrative regression network scheme on multi-layered genomic profiles from TCGA. Our method facilitates identification of strong signals as well as weaker signals by fusing information from different regression techniques. It could also be extended to integrate results obtained from different cancer types.
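The edge rule described in the Methods of the abstract above (two features are linked when their regression coefficients across expression traits are similar) can be sketched as follows. This is only an illustration under stated assumptions: cosine similarity and the 0.9 threshold are choices made for this sketch, the abstract does not name a similarity measure, and the feature names and coefficients are hypothetical. The sparse regression and similarity network fusion steps are not shown.

```python
import math
from itertools import combinations
from typing import Dict, List, Set, Tuple


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two coefficient profiles (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def regression_network(
    coefs: Dict[str, List[float]], threshold: float = 0.9
) -> Set[Tuple[str, str]]:
    """Link two features when their coefficient profiles are similar enough.

    coefs maps each feature to its regression coefficients across the same
    ordered list of gene expression traits. The threshold is an assumption.
    """
    edges = set()
    for f, g in combinations(sorted(coefs), 2):
        if cosine(coefs[f], coefs[g]) >= threshold:
            edges.add((f, g))
    return edges


# Hypothetical coefficients of three methylation features on three expression traits.
coefs = {
    "cg01": [0.8, 0.0, -0.5],
    "cg02": [0.7, 0.1, -0.4],
    "cg03": [0.0, 0.9, 0.0],
}
edges = regression_network(coefs)
print(edges)
```

Here cg01 and cg02 act on the same traits in the same direction and are linked, while cg03 affects a different trait and stays unconnected.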
Project description:Rapid development of genome-wide profiling technologies has made it possible to conduct integrative analysis on genomic data from multiple platforms. In this study, we develop a novel integrative Bayesian network approach to investigate the relationships between genetic and epigenetic alterations as well as how these mutations affect a patient's clinical outcome. We take a Bayesian network approach that admits a convenient decomposition of the joint distribution into local distributions. Exploiting the prior biological knowledge about regulatory mechanisms, we model each local distribution as linear regressions. This allows us to analyze multi-platform genome-wide data in a computationally efficient manner. We illustrate the performance of our approach through simulation studies. Our methods are motivated by and applied to a multi-platform glioblastoma dataset, from which we reveal several biologically relevant relationships that have been validated in the literature as well as new genes that could potentially be novel biomarkers for cancer progression.
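The decomposition mentioned in the abstract above can be written out as a worked formula. The factorization over parent sets is the standard Bayesian network identity. The Gaussian noise term and the symbols (pa(j) for the parents of node j, beta for the regression coefficients) are assumptions of this sketch, not notation taken from the study.

```latex
p(x_1, \dots, x_p) \;=\; \prod_{j=1}^{p} p\bigl(x_j \mid x_{\mathrm{pa}(j)}\bigr),
\qquad
x_j \;=\; \beta_{j0} + \sum_{k \in \mathrm{pa}(j)} \beta_{jk}\, x_k + \varepsilon_j,
\quad
\varepsilon_j \sim \mathcal{N}\bigl(0, \sigma_j^2\bigr)
```

Because each factor depends only on a node and its parents, each local regression can be fitted separately, which is what makes the multi-platform analysis computationally tractable.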
Project description:Super-enhancers are an emerging subclass of regulatory regions controlling cell identity and disease genes. However, their biological function and impact on miRNA networks are unclear. Here, we report that super-enhancers drive the biogenesis of master miRNAs crucial for cell identity by enhancing both transcription and Drosha/DGCR8-mediated primary miRNA (pri-miRNA) processing. Super-enhancers, together with broad H3K4me3 domains, shape a tissue-specific and evolutionarily conserved atlas of miRNA expression and function. CRISPR/Cas9 genomics revealed that super-enhancer constituents act cooperatively and facilitate Drosha/DGCR8 recruitment and pri-miRNA processing to boost cell-specific miRNA production. The BET-bromodomain inhibitor JQ1 preferentially inhibits super-enhancer-directed cotranscriptional pri-miRNA processing. Furthermore, super-enhancers are characterized by pervasive interaction with DGCR8/Drosha and DGCR8/Drosha-regulated mRNA stability control, suggesting unique RNA regulation at super-enhancers. Finally, super-enhancers mark multiple miRNAs associated with cancer hallmarks. This study presents principles underlying miRNA biology in health and disease and an unrecognized higher-order property of super-enhancers in RNA processing beyond transcription.
Project description:Background: Enhancers are distal cis-regulatory elements required for cell-specific gene expression and cell fate determination. In cancer, enhancer variation has been proposed as a major cause of inter-patient heterogeneity; however, most predicted enhancer regions remain to be functionally tested. Methods: We analyzed 132 epigenomic histone modification profiles of 18 primary gastric cancer (GC) samples, 18 normal gastric tissues, and 28 GC cell lines using Nano-ChIP-seq technology. We applied Capture-based Self-Transcribing Active Regulatory Region sequencing (CapSTARR-seq) to assess functional enhancer activity. An Activity-by-contact (ABC) model was employed to explore the effects of histone acetylation and CapSTARR-seq levels on enhancer-promoter interactions. Results: We report a comprehensive catalog of 75,730 recurrent predicted enhancers, the majority of which are GC-associated in vivo (> 50,000) and associated with lower somatic mutation rates inferred by whole-genome sequencing. Applying CapSTARR-seq to the enhancer catalog, we observed significant correlations between CapSTARR-seq functional activity and H3K27ac/H3K4me1 levels. Super-enhancer regions exhibited increased CapSTARR-seq signals compared to regular enhancers, even when decoupled from native chromatin contexture. We show that combining histone modification and CapSTARR-seq functional enhancer data improves the prediction of enhancer-promoter interactions and the pinpointing of germline single nucleotide polymorphisms (SNPs), somatic copy number alterations (SCNAs), and trans-acting TFs involved in GC expression. We identified cancer-relevant genes (ING1, ARL4C) whose expression between patients is influenced by enhancer differences in genomic copy number and germline SNPs, and HNF4α as a master trans-acting factor associated with GC enhancer heterogeneity. Conclusions: Our results indicate that combining histone modification and functional assay data may provide a more accurate metric to assess enhancer activity than either platform individually, providing insights into the relative contribution of genetic (cis) and regulatory (trans) mechanisms to GC enhancer functional heterogeneity.
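The Activity-by-contact idea referenced in the abstract above can be sketched with the commonly used form of the ABC score: an element's activity times its contact frequency with the promoter, divided by the sum of that product over all candidate elements near the gene. How this study defined activity is not stated in the abstract, so taking the geometric mean of H3K27ac and CapSTARR-seq signal is an assumption made here, and all element names and values are hypothetical.

```python
import math
from typing import Dict, Tuple


def activity(h3k27ac: float, capstarr: float) -> float:
    """Element activity as a geometric mean of two signals (an assumption of this sketch)."""
    return math.sqrt(h3k27ac * capstarr)


def abc_scores(elements: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Compute ABC scores for the candidate elements of one gene.

    elements maps an element name to (activity, contact frequency with the
    gene's promoter). Each score is activity * contact divided by the sum of
    activity * contact over all candidate elements.
    """
    total = sum(a * c for a, c in elements.values())
    if total == 0:
        return {name: 0.0 for name in elements}
    return {name: (a * c) / total for name, (a, c) in elements.items()}


# Hypothetical signals for three candidate enhancers of a single gene.
elements = {
    "e1": (activity(4.0, 9.0), 0.5),  # activity 6.0
    "e2": (activity(1.0, 4.0), 0.5),  # activity 2.0
    "e3": (activity(1.0, 1.0), 1.0),  # activity 1.0
}
scores = abc_scores(elements)
for name, score in sorted(scores.items()):
    print(name, round(score, 3))
```

The scores for one gene sum to 1, so each value can be read as the share of regulatory input attributed to that element. A strongly active element can outrank a weaker one that contacts the promoter more often.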
Project description:We identified and characterized multiple cell-type selective enhancers of the CFTR gene promoter in previous work and demonstrated active looping of these elements to the promoter. Here we address the impact of genomic spacing on these enhancer:promoter interactions and on CFTR gene expression. Using CRISPR/Cas9, we generated clonal cell lines with deletions between the -35 kb airway enhancer and the CFTR promoter in the 16HBE14o- airway cell line, or between the intron 1 (185 + 10 kb) intestinal enhancer and the promoter in the Caco2 intestinal cell line. The effect of these deletions on CFTR transcript abundance, as well as the 3D looping structure of the locus was investigated in triplicate clones of each modification. Our results indicate that both small and larger deletions upstream of the promoter can perturb CFTR expression and -35 kb enhancer:promoter interactions in the airway cells, though the larger deletions are more impactful. In contrast, the small intronic deletions have no effect on CFTR expression and intron 1 enhancer:promoter interactions in the intestinal cells, whereas larger deletions do. Clonal variation following a specific CFTR modification is a confounding factor particularly in 16HBE14o- cells.
Project description:Genome-wide association studies (GWAS) are identifying genetic predisposition to various diseases. The rs1859962 single nucleotide polymorphism (SNP), part of the 17q24.3 locus, is a risk factor for prostate cancer (PCa). It defines a 130-kb linkage disequilibrium (LD) block that lies in a ~2-Mb gene desert area. Despite a role for the proximal SOX9 gene in PCa development, the functional biology driving the risk at this 17q24.3 locus is unknown. In the present study, we integrate genome-wide chromatin landscape datasets, namely epigenomes and chromatin openness from diverse cell types, to identify one PCa-specific enhancer within the rs1859962 risk LD block. We reveal that this enhancer is part of a 1-Mb chromatin loop with the SOX9 gene in PCa cells. The rs8072254 and rs1859961 SNPs, which are part of this LD block, map to this enhancer and impose allele-specific gene expression. The variant allele of rs1859961 directly decreases FoxA1 binding while increasing AP-1 binding compared to the reference allele. The latter is key to driving allele-specific gene expression. Together, our results demonstrate that the risk associated with the PCa rs1859962 risk LD block is accounted for by multiple genetic variants mapping to a unique enhancer looping to the SOX9 oncogene. Allele-specific recruitment of the transcription factor AP-1 accounts in part for the increased enhancer activity ascribed to this PCa risk LD block. This further demonstrates that an integrative genomics approach can identify the functional biology disrupted by genetic risk variants. Examination of histone modification H3K36me3 in the prostate cancer LNCaP cell line under DHT treatment.