Cell-type specific and combinatorial usage of diverse transcription factors revealed by genome-wide binding studies in multiple human cells
ABSTRACT: This SuperSeries is composed of the following subset Series: GSE19622: Individual-specific and allele-specific chromatin signatures in diverse human populations GSE25416: High-resolution genome-wide in vivo footprinting of diverse transcription factors in human cells (ChIP-seq) GSE30226: Open chromatin defined by DNaseI and FAIRE identifies regulatory elements that shape cell-type identity [ChIP_seq]. GSE32692: Cell-type specific and combinatorial usage of diverse transcription factors revealed by genome-wide binding studies in multiple human cells [new ChIP-Seq samples] For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf Refer to individual Series
Project description: This dataset contains DNase-seq data and CTCF ChIP-seq data for 6 lymphoblastoid cell lines. There are 3 cell lines from a YRI trio and 3 lines from a CEU trio (HapMap GM19238, GM19239, GM19240, GM12891, GM12892, GM12878). For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf DNase-seq and ChIP-seq data from each of the 6 cell lines.
Project description: This SuperSeries is composed of the following subset Series: GSE25344: High-resolution genome-wide in vivo footprinting of diverse transcription factors in human cells (Dnase-seq) GSE25416: High-resolution genome-wide in vivo footprinting of diverse transcription factors in human cells (ChIP-seq) For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf Refer to individual Series
Project description: The human body contains thousands of unique cell types, each with specialized functions. Cell identity is governed in large part by gene transcription programs, which are determined by regulatory elements encoded in DNA. To identify regulatory elements active in seven cell lines representative of diverse human cell types, we used DNase-seq and FAIRE-seq to map “open chromatin”. Over 870,000 DNaseI or FAIRE sites, which correspond largely to nucleosome-depleted regions (NDRs), were identified across the seven cell lines, covering nearly 9% of the genome. The combination of DNaseI and FAIRE is more effective than either assay alone in identifying likely regulatory elements, as judged by coincidence with transcription factor binding locations determined in the same cells. Open chromatin common to all seven cell types tended to be at or near transcription start sites and encompassed more CTCF binding sites, while open chromatin sites found in only one cell type were typically located away from transcription start sites, and contained DNA motifs recognized by regulators of cell-type identity. As one example of its ability to identify functional DNA, we show that open chromatin regions bound by CTCF are potent insulators. We identified clusters of open regulatory elements (COREs) that were physically near each other and whose appearance was coordinated among one or more cell types. Gene expression and RNA Pol II binding data support the hypothesis that COREs control gene activity required for the maintenance of cell-type identity. This publicly available atlas of regulatory elements may prove valuable in identifying non-coding DNA sequence variants that are causally linked to human disease.
For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf DNase-seq, FAIRE-seq, and ChIP-seq were performed on seven human cell lines: GM12878 (lymphoblastoid), K562 (leukemia), HepG2 (hepatocellular carcinoma), HelaS3 (cervical carcinoma), HUVEC (human umbilical vein endothelial cells), NHEK (keratinocytes), and H1-ES (embryonic stem cells). For each cell line, two or three replicates were independently grown and split into three, one for each of the three experimental methods. Control ChIP experiments were performed on five of the cell lines with NHEK and H1-ES being excluded due to lack of material.
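The description above combines DNaseI and FAIRE sites and reports the fraction of the genome they cover. As a minimal sketch of that kind of calculation, assuming sites are available as half-open (chromosome, start, end) intervals, the union and its coverage could be computed as follows. The function names, the toy coordinates, and the genome size are illustrative assumptions, not values or code from the dataset.

```python
from typing import List, Tuple

# (chromosome, start, end), half-open coordinates; a hypothetical representation
Interval = Tuple[str, int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals on the same chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def covered_fraction(intervals: List[Interval], genome_size: int) -> float:
    """Fraction of the genome covered by the union of the intervals."""
    covered = sum(end - start for _, start, end in merge_intervals(intervals))
    return covered / genome_size


# Toy example data (not taken from the dataset)
dnase_sites = [("chr1", 100, 200), ("chr1", 500, 600)]
faire_sites = [("chr1", 150, 250), ("chr2", 0, 50)]

union_sites = merge_intervals(dnase_sites + faire_sites)
fraction = covered_fraction(dnase_sites + faire_sites, genome_size=1000)
```

Merging before summing avoids counting a base twice when a DNaseI site and a FAIRE site overlap.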
Project description: Using DNaseI hypersensitivity (HS) assays (DNase-seq), high-resolution DNaseI digestion profiles were generated genome-wide in diverse human cell types. We showed that within general regions of DNaseI HS that are known to identify locations of gene regulatory elements, DNaseI digestion patterns allowed us to identify locations of individual transcription factor binding sites that protected the bound DNA against digestion. To measure the accuracy of these footprints, we also generated ChIP-seq data for the CTCF DNA binding factor in the same cell growths. We found that DNaseI footprints containing the CTCF canonical binding motif show significant ChIP-seq signal, while CTCF binding motifs not in footprints show almost no signal, providing one measure of validation of the DNaseI footprints. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf Six cell lines representing cervical carcinoma, chronic myeloid leukemia, human embryonic stem cells, lymphoblastoid cells, human epidermal keratinocytes and umbilical vein endothelial cells were analyzed using DNase-seq.
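The validation described above compares ChIP-seq signal at CTCF motifs that fall inside DNaseI footprints with signal at motifs that do not. A minimal sketch of that comparison, assuming each motif instance already carries a ChIP-seq signal value, might look like this. The data layout, function names, and numbers are hypothetical and are not the authors' pipeline.

```python
from statistics import mean
from typing import List, Tuple

Motif = Tuple[str, int, int, float]   # (chromosome, start, end, ChIP-seq signal)
Footprint = Tuple[str, int, int]      # (chromosome, start, end), half-open


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def split_signal_by_footprint(
    motifs: List[Motif], footprints: List[Footprint]
) -> Tuple[List[float], List[float]]:
    """Return ChIP signals for motifs inside footprints and for motifs outside."""
    inside: List[float] = []
    outside: List[float] = []
    for chrom, start, end, signal in motifs:
        hit = any(
            f_chrom == chrom and overlaps(start, end, f_start, f_end)
            for f_chrom, f_start, f_end in footprints
        )
        (inside if hit else outside).append(signal)
    return inside, outside


# Toy example data (not taken from the dataset)
footprints = [("chr1", 100, 130), ("chr2", 40, 60)]
motifs = [
    ("chr1", 105, 124, 50.0),  # overlaps the chr1 footprint
    ("chr1", 300, 319, 2.0),   # no footprint nearby
    ("chr2", 45, 64, 30.0),    # overlaps the chr2 footprint
    ("chr2", 500, 519, 0.0),   # no footprint nearby
]

inside, outside = split_signal_by_footprint(motifs, footprints)
mean_inside, mean_outside = mean(inside), mean(outside)
```

The linear scan over footprints is adequate for a toy example; genome-scale data would normally use a sorted or indexed interval structure instead.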
Project description: Using DNaseI hypersensitivity (HS) assays (DNase-seq), high-resolution DNaseI digestion profiles were generated genome-wide in diverse human cell types. We showed that within general regions of DNaseI HS that are known to identify locations of gene regulatory elements, DNaseI digestion patterns allowed us to identify locations of individual transcription factor binding sites that protected the bound DNA against digestion. To measure the accuracy of these footprints, we also generated ChIP-seq data for the CTCF DNA binding factor in the same cell growths. We found that DNaseI footprints containing the CTCF canonical binding motif show significant ChIP-seq signal, while CTCF binding motifs not in footprints show almost no signal, providing one measure of validation of the DNaseI footprints. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf Five cell lines representing cervical carcinoma, chronic myeloid leukemia, embryonic stem cells, epidermal keratinocytes, and umbilical vein endothelial cells were analyzed using ChIP-seq.
Project description: The epithelium lining the epididymis has a pivotal role in ensuring a luminal environment that can support normal sperm maturation. Many of the individual genes that encode proteins involved in establishing the epididymal luminal fluid are well characterized. They include ion channels, ion exchangers, transporters and solute carriers. However, the molecular mechanisms that coordinate expression of these genes and modulate their activities in response to biological stimuli are less well understood. To identify cis-regulatory elements for genes expressed in human epididymis epithelial cells, we generated genome-wide maps of open chromatin by DNase-seq. This analysis identified 33,542 epididymis-selective DNase I hypersensitive sites (DHS), which were not evident in five cell types of different lineages. Identification of genes with epididymis-selective DHS at their promoters revealed gene pathways that are active in immature epididymis epithelial cells. These include processes correlating with epithelial function and also others with specific roles in the epididymis, including retinol metabolism and ascorbate and aldarate metabolism. Peaks of epididymis-selective chromatin were seen in the androgen receptor gene and the cystic fibrosis transmembrane conductance regulator (CFTR) gene, which has a critical role in regulating ion transport across the epididymis epithelium. In silico prediction of transcription factor binding sites that were over-represented in epididymis-selective DHS identified epithelial transcription factors including ELF5 and ELF3, the androgen receptor, Pax2 and Sox9, as components of epididymis transcriptional networks. Active genes, which are targets of each transcription factor, reveal important biological processes in the epididymis epithelium.
Project description: Distal cell-type-specific regulatory elements may be located at very large distances from the genes that they control and are often hidden within intergenic regions or in introns of other genes. The development of methods that enable mapping of regions of open chromatin genome-wide has greatly advanced the identification and characterisation of these elements. Here we use DNase I hypersensitivity mapping followed by deep sequencing (DNase-seq) to generate a map of open chromatin in primary human tracheal epithelial (HTE) cells and use bioinformatic approaches to characterise the distribution of these sites within the genome and with respect to gene promoters, intronic and intergenic regions. Genes with HTE-selective open chromatin at their promoters were associated with multiple pathways of epithelial function and differentiation. The data predict novel cell-type-specific regulatory elements for genes involved in HTE cell function, such as structural proteins and ion channels, and the transcription factors that may interact with them to control gene expression. Moreover, the map of open chromatin can identify the location of potentially critical regulatory elements in genome-wide association studies (GWAS) in which the strongest association is with single nucleotide polymorphisms in non-coding regions of the genome. We demonstrate its relevance to a recent GWAS that identifies modifiers of cystic fibrosis lung disease severity. Since HTE cells have many functional similarities with bronchial epithelial cells and other differentiated cells in the respiratory epithelium, these data are of direct relevance to elucidating the molecular basis of normal lung function and lung disease. To identify cis-regulatory elements for genes expressed in human tracheal epithelial cells, we generated genome-wide maps of open chromatin by DNase-seq. HTE cells were isolated from trachea and grown as described previously (Davis P.B. et al., 1990).
Project description: Cellular diversity in a multicellular organism like the human is achieved in part by distinct programs of gene expression at the level of transcription, which in turn are mediated by transcription factors (TFs). However, there are few systematic studies of the genomic binding of different types of TFs across a wide range of human cell types, especially in relation to gene expression. Using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq), we identified an average of 45,000, 30,000 and 8,000 binding sites for CTCF, Pol II and MYC, respectively, across eleven cell types. CTCF preferred to bind to intergenic regions, while Pol II and MYC tended to bind to core promoter regions. CTCF sites were highly conserved across diverse cell types, whereas MYC showed the greatest cell-type specificity. MYC co-localized with Pol II at many of their binding sites and putative target genes. Cell-type specific binding sites, in particular for MYC and Pol II, were associated with cell-type specific functions. Patterns of binding in relation to gene features were generally invariant across different cell types. Pol II occupancy was higher over exons than adjacent introns, likely reflecting a link between transcriptional elongation and splicing. TF binding was positively correlated with the expression levels of their putative target genes, but combinatorial binding, in particular of MYC and Pol II, was even more strongly associated with higher gene expression. Our ChIP-seq data shed considerable light on how these transcription factors of different types function individually and in combination with one another to occupy target genomic loci and shape gene expression programs in a cell-type specific manner.
For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf Keywords: Genome binding/occupancy profiling by high throughput sequencing. ChIP-seq was performed on eleven human cell lines: GM12878 (lymphoblastoid), K562 (leukemia), HepG2 (hepatocellular carcinoma), HelaS3 (cervical carcinoma), HUVEC (human umbilical vein endothelial cells), NHEK (keratinocytes), H1-ES (embryonic stem cells), MCF7 (breast adenocarcinoma), FB8470 (normal child fibroblasts), FB0167P (progeria fibroblasts), and H54 (glioblastoma). For each cell line, two or three replicates were independently grown and split into three, one for each of the three experimental methods. Control ChIP experiments were performed on five of the cell lines with NHEK and H1-ES being excluded due to lack of material. Additional ChIP-Seq data mentioned in the 'overall design' but not included in this Series are contained in the GSE30226 SubSeries.