Project description:BackgroundThe increasing availability of multiple types of genomic profiles measured from the same cancer patients has provided numerous opportunities for investigating genomic mechanisms underlying cancer. In particular, association studies of gene expression traits with respect to multi-layered genomic features are highly useful for uncovering the underlying mechanism. Conventional correlation-based association tests are limited because they are prone to revealing indirect associations. Moreover, integration of multiple types of genomic features raises another challenge.MethodsIn this study, we propose a new framework for association studies called integrative regression network that identifies genomic associations on multiple high-dimensional genomic profiles by taking into account the associations between as well as within profiles. We employed high-dimensional regression techniques to first identify the associations between different genomic profiles. Based on the resulting regression coefficients, a regression network was constructed within each profile. For example, two methylation features having similar regression coefficients with respect to a number of gene expression traits are likely to be involved in the same biological process and therefore we define an edge between two methylation features in the regression network. To extract more reliable associations, multiple sparse structured regression techniques were applied and the resulting multiple networks were merged as the integrative regression network using a similarity network fusion technique.ResultsExperiments were carried out using four different sparse structured regression methods on five cancer types from TCGA. The advantages and disadvantages of each regression method were also explored. We find there was large inconsistency in the results from different regression methods, which supports the need to extract the proposed integrative regression network from multiple complimentary regression techniques. Fusing multiple regression networks by using similarity measurements led to the identification of significant gene pairs and a resulting network with better topological properties.ConclusionsWe developed and validated the integrative regression network scheme on multi-layered genomic profiles from TCGA. Our method facilitates identification of the strong signals as well as weaker signals by fusing information from different regression techniques. It could be extended to integrate results obtained from different cancer types as well.

Project description:BackgroundEnhancers are distal cis-regulatory elements required for cell-specific gene expression and cell fate determination. In cancer, enhancer variation has been proposed as a major cause of inter-patient heterogeneity-however, most predicted enhancer regions remain to be functionally tested.MethodsWe analyzed 132 epigenomic histone modification profiles of 18 primary gastric cancer (GC) samples, 18 normal gastric tissues, and 28 GC cell lines using Nano-ChIP-seq technology. We applied Capture-based Self-Transcribing Active Regulatory Region sequencing (CapSTARR-seq) to assess functional enhancer activity. An Activity-by-contact (ABC) model was employed to explore the effects of histone acetylation and CapSTARR-seq levels on enhancer-promoter interactions.ResultsWe report a comprehensive catalog of 75,730 recurrent predicted enhancers, the majority of which are GC-associated in vivo (> 50,000) and associated with lower somatic mutation rates inferred by whole-genome sequencing. Applying CapSTARR-seq to the enhancer catalog, we observed significant correlations between CapSTARR-seq functional activity and H3K27ac/H3K4me1 levels. Super-enhancer regions exhibited increased CapSTARR-seq signals compared to regular enhancers, even when decoupled from native chromatin contexture. We show that combining histone modification and CapSTARR-seq functional enhancer data improves the prediction of enhancer-promoter interactions and pinpointing of germline single nucleotide polymorphisms (SNPs), somatic copy number alterations (SCNAs), and trans-acting TFs involved in GC expression. We identified cancer-relevant genes (ING1, ARL4C) whose expression between patients is influenced by enhancer differences in genomic copy number and germline SNPs, and HNF4α as a master trans-acting factor associated with GC enhancer heterogeneity.ConclusionsOur results indicate that combining histone modification and functional assay data may provide a more accurate metric to assess enhancer activity than either platform individually, providing insights into the relative contribution of genetic (cis) and regulatory (trans) mechanisms to GC enhancer functional heterogeneity.

Dataset Information

Integrative Functional and 3D Genomic Analysis Reveals An Enhancer Network Activity at the PTEN Locus

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets