Inference of hierarchical regulatory network of estrogen-dependent breast cancer through ChIP-based data.
ABSTRACT: Global profiling of in vivo protein-DNA interactions using ChIP-based technologies has evolved rapidly in recent years. Although many genome-wide studies have identified thousands of ERα binding sites and have revealed the associated transcription factor (TF) partners, such as AP1, FOXA1 and CEBP, little is known about ERα-associated hierarchical transcriptional regulatory networks. In this study, we applied computational approaches to analyze three publicly available ChIP-based datasets (ChIP-seq, ChIP-PET and ChIP-chip) and to investigate the hierarchical regulatory network of ERα and its partner TFs in estrogen-dependent breast cancer MCF7 cells. Sixteen common TFs and two common new TF partners (RORA and PITX2) were found among the ChIP-seq, ChIP-chip and ChIP-PET datasets. The regulatory networks were constructed by scanning the ChIP-peak regions with TF-specific position weight matrices (PWMs). A permutation test was performed to test the reliability of each connection of the network. We then used DREM software to perform gene ontology function analysis on the common genes. We found that FOS, PITX2, RORA and FOXA1 were involved in the up-regulated genes. We also conducted ERα and Pol-II ChIP-seq experiments in tamoxifen-resistant MCF7 cells (denoted as MCF7-T in this study) and compared MCF7 and MCF7-T cells. The results showed very little overlap between these two cell lines in terms of targeted genes (21.2% of common genes) and targeted TFs (25% of common TFs). The significant dissimilarity may indicate totally different transcriptional regulatory mechanisms between these two cancer cell lines. Our study uncovers new estrogen-mediated regulatory networks by mining three ChIP-based datasets in MCF7 cells and ChIP-seq data in MCF7-T cells. We compared the different ChIP-based technologies as well as different breast cancer cells. Our computational analytical approach may guide biologists to further study the underlying mechanisms in breast cancer cells or other human diseases.
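The two computational steps named in the abstract, scanning peak regions with a PWM and checking a connection with a permutation test, can be illustrated with a minimal Python sketch. It is not the study's pipeline: the abstract does not specify the scoring scheme, the threshold or the permutation design, so the log-odds scoring, the uniform background, the pseudocount, the sequence-shuffling null and every name below (counts_to_log_odds, scan, permutation_p_value, the toy motif) are assumptions made for illustration only.

```python
import math
import random

BASES = "ACGT"


def counts_to_log_odds(counts, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into log2-odds scores (assumed scoring scheme)."""
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2((column.get(b, 0) + pseudocount) / total / background)
            for b in BASES
        })
    return pwm


def window_scores(sequence, pwm):
    """Score every window of the sequence; expects only A/C/G/T characters."""
    width = len(pwm)
    return [
        sum(pwm[i][base] for i, base in enumerate(sequence[start:start + width]))
        for start in range(len(sequence) - width + 1)
    ]


def scan(sequence, pwm, threshold):
    """Return (start, score) for each window scoring at or above the threshold."""
    return [
        (start, score)
        for start, score in enumerate(window_scores(sequence, pwm))
        if score >= threshold
    ]


def permutation_p_value(sequence, pwm, n_permutations=200, seed=0):
    """Empirical p-value of the best window score against shuffled sequences.

    Shuffling the peak sequence is an assumed null model, not the study's design.
    """
    rng = random.Random(seed)
    observed = max(window_scores(sequence, pwm))
    shuffled = list(sequence)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if max(window_scores("".join(shuffled), pwm)) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_good + 1) / (n_permutations + 1)


# Hypothetical 4-bp motif with consensus TGAC, scanned over a toy peak sequence.
toy_counts = [{"T": 10}, {"G": 10}, {"A": 10}, {"C": 10}]
toy_pwm = counts_to_log_odds(toy_counts)
peak_sequence = "AATGACTT"
hits = scan(peak_sequence, toy_pwm, threshold=5.0)
p_value = permutation_p_value(peak_sequence, toy_pwm)
```

In a real analysis the matrices would come from a curated motif collection and the threshold would be calibrated per motif; the toy values here only show the mechanics of scoring each window and comparing the best score with a shuffled background.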
Project description:The TCF7L2 transcription factor (TF) is a member of the Wnt signalling pathway, and may influence transcription of several genes by binding to distinct regulatory regions. Genome-wide studies have identified thousands of TCF7L2 binding sites and have revealed some associated TF partners. However, there is still a large uncharted region in the hierarchical regulatory network for TCF7L2 and its partner TFs in MCF7 cells. We analysed ChIP-seq data by searching for motifs in the enriched peak regions using TF-specific position weight matrices (PWMs). We found that FOXO1 and CAD were associated with up-regulated genes, and AP2α, PBF and AP1 with down-regulated genes. TCF7L2 and GATA3 were found to be associated with both up- and down-regulated genes. Our study uncovers new TCF7L2-associated regulatory networks by mining ChIP-seq data in MCF7 cells, which may contribute to further study of the mechanisms related to the Wnt pathway in breast cancer or other diseases.
Project description:Chromatin immunoprecipitation followed by next-generation DNA sequencing (ChIP-seq) is a widely used technique for identifying transcription factor (TF) binding events throughout an entire genome. However, ChIP-seq is limited by the availability of suitable ChIP-seq grade antibodies, and the vast majority of commercially available antibodies fail to generate usable data sets. To ameliorate these technical obstacles, we present a robust methodological approach for performing ChIP-seq through epitope tagging of endogenous TFs. We used clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9-based genome editing technology to develop CRISPR epitope tagging ChIP-seq (CETCh-seq) of DNA-binding proteins. We assessed the feasibility of CETCh-seq by tagging several DNA-binding proteins spanning a wide range of endogenous expression levels in the hepatocellular carcinoma cell line HepG2. Our data exhibit strong correlations between both replicate types as well as with standard ChIP-seq approaches that use TF antibodies. Notably, we also observed minimal changes to the cellular transcriptome and to the expression of the tagged TF. To examine the robustness of our technique, we further performed CETCh-seq in the breast adenocarcinoma cell line MCF7 as well as mouse embryonic stem cells and observed similarly high correlations. Collectively, these data highlight the applicability of CETCh-seq to accurately define the genome-wide binding profiles of DNA-binding proteins, allowing for a straightforward methodology to potentially assay the complete repertoire of TFs, including the large fraction for which ChIP-quality antibodies are not available.
Project description:Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is the most popular assay to identify genomic regions, called ChIP-seq peaks, that are bound in vivo by transcription factors (TFs). These regions are derived from direct TF-DNA interactions, indirect binding of the TF to the DNA (through a co-binding partner), nonspecific binding to the DNA, and noise/bias/artifacts. Delineating the bona fide direct TF-DNA interactions within the ChIP-seq peaks remains challenging. We developed a dedicated software, ChIP-eat, that combines computational TF binding models and ChIP-seq peaks to automatically predict direct TF-DNA interactions. Our work culminated with predicted interactions covering >4% of the human genome, obtained by uniformly processing 1983 ChIP-seq peak data sets from the ReMap database for 232 unique TFs. The predictions were a posteriori assessed using protein binding microarray and ChIP-exo data, and were predominantly found in high quality ChIP-seq peaks. The set of predicted direct TF-DNA interactions suggested that high-occupancy target regions are likely not derived from direct binding of the TFs to the DNA. Our predictions derived co-binding TFs supported by protein-protein interaction data and defined cis-regulatory modules enriched for disease- and trait-associated SNPs. We provide this collection of direct TF-DNA interactions and cis-regulatory modules through the UniBind web-interface (http://unibind.uio.no).
Project description:The Plant Promoter Analysis Navigator (PlantPAN; http://PlantPAN.itps.ncku.edu.tw/) is an effective resource for predicting regulatory elements and reconstructing transcriptional regulatory networks for plant genes. In this release (PlantPAN 3.0), 17 230 TFs were collected from 78 plant species. To explore regulatory landscapes, genomic locations of TFBSs have been captured from 662 public ChIP-seq samples using standard data processing. A total of 1 233 999 regulatory linkages were identified from 99 regulatory factors (TFs, histones and other DNA-binding proteins) and their target genes across seven species. Additionally, this new version added 2449 matrices extracted from ChIP-seq peaks for cis-regulatory element prediction. In addition to integrated ChIP-seq data, four major improvements were provided for more comprehensive information of TF binding events, including (i) 1107 experimentally verified TF matrices from the literature, (ii) gene regulation network comparison between two species, (iii) 3D structures of TFs and TF-DNA complexes and (iv) condition-specific co-expression networks of TFs and their target genes extended to four species. The PlantPAN 3.0 can not only be efficiently used to investigate critical cis- and trans-regulatory elements in plant promoters, but also to reconstruct high-confidence relationships among TF-targets under specific conditions.
Project description:BACKGROUND: Genetic alterations of transcription factors (TFs) have been implicated in the tumorigenesis of cancers. In many cancers, alteration of TFs results in their aberrant activity without changing their gene expression level. Gene expression data from microarray or RNA-seq experiments can capture the expression change of genes; however, it is still a challenge to reveal the activity change of TFs. RESULTS: Here we propose a method, called REACTIN (REgulatory ACTivity INference), which integrates TF binding data with gene expression data to identify TFs with significantly differential activity between disease and normal samples. REACTIN successfully detects differential activity of estrogen receptor (ER) between ER+ and ER- samples in 10 breast cancer datasets. When applied to compare tumor and normal breast samples, it reveals TFs that are critical for carcinogenesis of breast cancer. Moreover, REACTIN can be utilized to identify transcriptional programs that are predictive of survival time in breast cancer patients. CONCLUSIONS: REACTIN provides a useful tool to investigate regulatory programs underlying a biological process, provided that the related case and control gene expression data are available. Considering the enormous amount of cancer gene expression data and the increasingly accumulating ChIP-seq data, we expect wide application of REACTIN for revealing the regulatory mechanisms of various diseases.
Project description:Understanding the cell-specific binding patterns of transcription factors (TFs) is fundamental to studying gene regulatory networks in biological systems, for which ChIP-seq not only provides valuable data but is also considered as the gold standard. Despite tremendous efforts from the scientific community to conduct TF ChIP-seq experiments, the available data represent only a limited percentage of ChIP-seq experiments, considering all possible combinations of TFs and cell lines. In this study, we demonstrate a method for accurately predicting cell-specific TF binding for TF-cell line combinations based on only a small fraction (4%) of the combinations using available ChIP-seq data. The proposed model, termed TFImpute, is based on a deep neural network with a multi-task learning setting to borrow information across transcription factors and cell lines. Compared with existing methods, TFImpute achieves comparable accuracy on TF-cell line combinations with ChIP-seq data; moreover, TFImpute achieves better accuracy on TF-cell line combinations without ChIP-seq data. This approach can predict cell line specific enhancer activities in K562 and HepG2 cell lines, as measured by massively parallel reporter assays, and predicts the impact of SNPs on TF binding.
Project description:We report the ER alpha regulatory network of the tamoxifen-resistant MCF7 cell line using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq). By integrating gene expression data (previously reported) with the ChIP-seq data, we generated the ER alpha regulatory network and pathways. For the ER alpha regulatory network, hub TFs with enriched motifs were identified from ER alpha peaks together with PolII peaks. We then scanned the ER alpha peak regions of a given gene with position weight matrices (PWMs) to determine the regulatory relationship between hub TFs and normal TFs. For the regulatory pathways, genes were grouped based on their expression values at 4 different time points. The hub TF that plays an important role at each time point in each group was then identified. This study provides a framework for applying ChIP-seq and gene expression data to the construction of the ER alpha regulatory network. 4 different ChIP-seq datasets in the tamoxifen-resistant MCF7 cell line
Project description:BACKGROUND: How transcription factors (TFs) interact with cis-regulatory sequences and interact with each other is a fundamental, but not well understood, aspect of gene regulation. METHODOLOGY/PRINCIPAL FINDINGS: We present a computational method to address this question, relying on the established biophysical principles. This method, STAP (sequence to affinity prediction), takes into account all combinations and configurations of strong and weak binding sites to analyze large scale transcription factor (TF)-DNA binding data to discover cooperative interactions among TFs, infer sequence rules of interaction and predict TF target genes in new conditions with no TF-DNA binding data. The distinctions between STAP and other statistical approaches for analyzing cis-regulatory sequences include the utility of physical principles and the treatment of the DNA binding data as quantitative representation of binding strengths. Applying this method to the ChIP-seq data of 12 TFs in mouse embryonic stem (ES) cells, we found that the strength of TF-DNA binding could be significantly modulated by cooperative interactions among TFs with adjacent binding sites. However, further analysis on five putatively interacting TF pairs suggests that such interactions may be relatively insensitive to the distance and orientation of binding sites. Testing a set of putative Nanog motifs, STAP showed that a novel Nanog motif could better explain the ChIP-seq data than previously published ones. We then experimentally tested and verified the new Nanog motif. A series of comparisons showed that STAP has more predictive power than several state-of-the-art methods for cis-regulatory sequence analysis. We took advantage of this power to study the evolution of TF-target relationship in Drosophila. By learning the TF-DNA interaction models from the ChIP-chip data of D. melanogaster (Mel) and applying them to the genome of D. pseudoobscura (Pse), we found that only about half of the sequences strongly bound by TFs in Mel have high binding affinities in Pse. We show that prediction of functional TF targets from ChIP-chip data can be improved by using the conservation of STAP predicted affinities as an additional filter. CONCLUSIONS/SIGNIFICANCE: STAP is an effective method to analyze binding site arrangements, TF cooperativity, and TF target genes from genome-wide TF-DNA binding data.
Project description:We report the ER alpha regulatory network of the tamoxifen-resistant MCF7 cell line using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq). By integrating gene expression data (previously reported) with the ChIP-seq data, we generated the ER alpha regulatory network and pathways. For the ER alpha regulatory network, hub TFs with enriched motifs were identified from ER alpha peaks together with PolII peaks. We then scanned the ER alpha peak regions of a given gene with position weight matrices (PWMs) to determine the regulatory relationship between hub TFs and normal TFs. For the regulatory pathways, genes were grouped based on their expression values at 4 different time points. The hub TF that plays an important role at each time point in each group was then identified. This study provides a framework for applying ChIP-seq and gene expression data to the construction of the ER alpha regulatory network. Overall design: 4 different ChIP-seq datasets in the tamoxifen-resistant MCF7 cell line
Project description:ChIP-seq reveals genomic regions where proteins, e.g. transcription factors (TFs), interact with DNA. A substantial fraction of these regions, however, do not contain the cognate binding site for the TF of interest. This phenomenon might be explained by protein-protein interactions and co-precipitation of interacting gene regulatory elements. We uniformly processed 3727 human ChIP-seq data sets and determined the cistrome of 292 TFs, as well as the distances between the TF binding motif centers and the ChIP-seq peak summits. ChIPSummitDB enables the analysis of ChIP-seq data using multiple approaches. The 292 cistromes and corresponding ChIP-seq peak sets can be browsed in GenomeView. Overlapping SNPs can be inspected in dbSNPView. Most importantly, the MotifView and PairShiftView pages show the average distance between motif centers and overlapping ChIP-seq peak summits and the distributions of these distances, respectively. In addition to providing a comprehensive human TF binding site collection, the ChIPSummitDB database and web interface allow for the examination of the topological arrangement of TF complexes genome-wide. ChIPSummitDB is freely accessible at http://summit.med.unideb.hu/summitdb/. The database will be regularly updated and extended with the newly available human and mouse ChIP-seq data sets.
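The distance between a motif center and the summit of an overlapping peak, the quantity summarized in the description above, can be computed along the following lines. This is a hedged sketch and not ChIPSummitDB code: the tuple layouts, the half-open coordinate convention, the function name motif_summit_offsets and the toy coordinates are all assumptions, and strand orientation is ignored.

```python
from statistics import mean


def motif_summit_offsets(peaks, motif_hits):
    """Signed distance (motif center minus summit) for motifs overlapping a peak.

    peaks:      list of (chrom, start, end, summit), half-open coordinates (assumed layout)
    motif_hits: list of (chrom, start, end), half-open coordinates (assumed layout)
    """
    offsets = []
    for chrom, m_start, m_end in motif_hits:
        motif_center = (m_start + m_end) / 2
        for p_chrom, p_start, p_end, summit in peaks:
            overlaps = chrom == p_chrom and m_start < p_end and m_end > p_start
            if overlaps:
                offsets.append(motif_center - summit)
    return offsets


# Toy data: one peak with its summit at 150 and three candidate motif hits,
# of which only the first lies on the same chromosome and overlaps the peak.
toy_peaks = [("chr1", 100, 200, 150)]
toy_motifs = [("chr1", 140, 150), ("chr1", 300, 310), ("chr2", 140, 150)]
offsets = motif_summit_offsets(toy_peaks, toy_motifs)
average_offset = mean(offsets)
```

The nested loop is quadratic and only suits small inputs; for genome-scale peak sets a sorted sweep or an interval index would be the usual choice.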