Computational characterization of chromatin domain boundary-associated genomic elements.
ABSTRACT: Topologically associated domains (TADs) are 3D genomic structures with high internal interactions that play important roles in genome compaction and gene regulation. Their genomic locations and their association with CCCTC-binding factor (CTCF)-binding sites and transcription start sites (TSSs) were recently reported. However, the relationship between TADs and other genomic elements has not been systematically evaluated. This was addressed in the present study, with a focus on the enrichment of these genomic elements and their ability to predict the TAD boundary region. We found that consensus CTCF-binding sites were strongly associated with TAD boundaries as well as with the transcription factors (TFs) Zinc finger protein (ZNF)143 and Yin Yang (YY)1. TAD boundary-associated genomic elements include DNase I-hypersensitive sites, H3K36 trimethylation, TSSs, RNA polymerase II, and TFs such as Specificity protein 1, ZNF274 and SIX homeobox 5. Computational modeling with these genomic elements suggests that they have distinct roles in TAD boundary formation. We propose a structural model of TAD boundaries based on these findings that provides a basis for studying the mechanism of chromatin structure formation and gene regulation.
Project description:BACKGROUND: Topologically associating domains (TADs) are genomic regions of self-interaction. Additionally, it is known that TAD boundaries are enriched in CTCF binding sites. In turn, CTCF sites are known to be asymmetric, whereby the convergent configuration of a pair of CTCF sites leads to the formation of a chromatin loop in vivo. However, to date, it has been unclear how to reconcile TAD structure with CTCF-based chromatin loops. RESULTS: We approach this problem by analysing CTCF binding site strengths and classifying clusters of CTCF sites along the genome on the basis of their relative orientation. Analysis of CTCF site orientation classes as a function of their spatial distribution along the human genome reveals that convergent CTCF site clusters are depleted while divergent CTCF clusters are enriched in the 5- to 100-kb range. We then analyse the distribution of CTCF binding sites as a function of TAD boundary conservation across seven primary human blood cell types. This reveals divergent CTCF site enrichment at TAD boundaries. Furthermore, convergent arrays of CTCF sites separate the left and right sections of TADs that harbour internal CTCF sites, resulting in unequal TAD 'halves'. CONCLUSIONS: The orientation-based CTCF binding site cluster classification that we present reconciles TAD boundaries and CTCF site clusters in a mechanistically elegant fashion. This model suggests that the emergent structure of nuclear chromatin in the form of TADs relies on the obligate alternation of divergent and convergent CTCF site clusters that occur at different length scales along the genome.
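The orientation classes named in this abstract can be illustrated with a minimal Python sketch. It simplifies the study's approach: it classifies pairs of neighbouring sites rather than clusters, and the function names, the toy coordinates, and the use of the 5- to 100-kb range as a gap filter are assumptions made for illustration only.

```python
from collections import Counter


def classify_pair(left_strand, right_strand):
    """Classify two adjacent CTCF motifs by relative orientation.

    left_strand and right_strand are '+' or '-' for the upstream and
    downstream site (in genomic coordinates), respectively.
    """
    if left_strand == "+" and right_strand == "-":
        return "convergent"  # motifs point toward each other: > <
    if left_strand == "-" and right_strand == "+":
        return "divergent"   # motifs point away from each other: < >
    return "tandem"          # same orientation: > > or < <


def count_pair_classes(sites, min_gap=5_000, max_gap=100_000):
    """Count orientation classes for neighbouring sites within a gap window.

    sites: list of (position_bp, strand) tuples on a single chromosome.
    """
    ordered = sorted(sites)
    counts = Counter()
    for (pos_a, strand_a), (pos_b, strand_b) in zip(ordered, ordered[1:]):
        if min_gap <= pos_b - pos_a <= max_gap:
            counts[classify_pair(strand_a, strand_b)] += 1
    return counts


# Hypothetical toy sites, not taken from any dataset.
sites = [(10_000, "+"), (40_000, "-"), (60_000, "+"),
         (500_000, "+"), (520_000, "+")]
counts = count_pair_classes(sites)
print(dict(counts))
```

In practice the (position, strand) tuples would come from motif calls within ChIP-seq peaks; the sketch only shows how the three orientation labels are assigned.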
Project description:The genome is organized via CTCF-cohesin-binding sites, which partition chromosomes into 1-5 megabase (Mb) topologically associated domains (TADs), and further into smaller sub-domains (sub-TADs). Here we examined in vivo an ~80 kb sub-TAD, containing the mouse α-globin gene cluster, lying within a ~1 Mb TAD. We find that the sub-TAD is flanked by predominantly convergent CTCF-cohesin sites that are ubiquitously bound by CTCF but only interact during erythropoiesis, defining a self-interacting erythroid compartment. Whereas the α-globin regulatory elements normally act solely on promoters downstream of the enhancers, removal of a conserved upstream CTCF-cohesin boundary extends the sub-TAD to adjacent upstream CTCF-cohesin-binding sites. The α-globin enhancers now interact with the flanking chromatin, upregulating expression of genes within this extended sub-TAD. Rather than acting solely as a barrier to chromatin modification, CTCF-cohesin boundaries in this sub-TAD delimit the region of chromatin to which enhancers have access and within which they interact with receptive promoters.
Project description:The potential impact of structural variants includes not only the duplication or deletion of coding sequences, but also the perturbation of noncoding DNA regulatory elements and structural chromatin features, including topological domains (TADs). Structural variants disrupting TAD boundaries have been implicated both in cancer and developmental disease; this likely occurs via "enhancer hijacking," whereby removal of the TAD boundary exposes enhancers to new target transcription start sites (TSSs). With this functional role, we hypothesized that boundaries would display evidence for negative selection. Here we demonstrate that the chromatin landscape constrains structural variation both within healthy humans and across primate evolution. In contrast, in patients with developmental delay, variants occur remarkably uniformly across genomic features, suggesting a potentially broad role for enhancer hijacking in human disease.
Project description:The genome is organized into repeating topologically associated domains (TADs), each of which is spatially isolated from its neighbor by poorly understood boundary elements thought to be conserved across cell types. Here, we show that deletion of CTCF (CCCTC-binding factor)-binding sites at TAD and sub-TAD topological boundaries that form within the HoxA and HoxC clusters during differentiation not only disturbs local chromatin domain organization and regulatory interactions but also results in homeotic transformations typical of Hox gene misregulation. Moreover, our data suggest that CTCF-dependent boundary function can be modulated by competing forces, such as the self-assembly of polycomb domains within the nucleus. Therefore, CTCF boundaries are not merely static structural components of the genome but instead are locally dynamic regulatory structures that control gene expression during development.
Project description:The metazoan genome is compartmentalized in areas of highly interacting chromatin known as topologically associating domains (TADs). TADs are demarcated by boundaries mostly conserved across cell types and even across species. However, a genome-wide characterization of TAD boundary strength in mammals is still lacking. In this study, we first use fused two-dimensional lasso as a machine learning method to improve Hi-C contact matrix reproducibility, and, subsequently, we categorize TAD boundaries based on their insulation score. We demonstrate that higher TAD boundary insulation scores are associated with elevated CTCF levels and that they may differ across cell types. Intriguingly, we observe that super-enhancers are preferentially insulated by strong boundaries. Furthermore, we demonstrate that strong TAD boundaries and super-enhancer elements are frequently co-duplicated in cancer patients. Taken together, our findings suggest that super-enhancers insulated by strong TAD boundaries may be exploited, as a functional unit, by cancer cells to promote oncogenesis.
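As an illustration of how a boundary can be scored from a Hi-C contact matrix, the following Python sketch computes a commonly used sliding-window insulation profile: for each bin, the mean contact frequency between the `window` bins upstream and the `window` bins downstream. This is an assumed, generic formulation rather than the exact score used in the study; in this convention a lower value means stronger insulation. The function name and the toy matrix are hypothetical.

```python
def insulation_profile(matrix, window):
    """Mean contact frequency across each bin, using a square window.

    matrix: symmetric list-of-lists of contact counts (n x n bins).
    window: number of bins taken on each side of the scored bin.
    Bins closer than `window` to either end are left as None.
    """
    n = len(matrix)
    profile = [None] * n
    for i in range(window, n - window):
        total = 0.0
        # Sum contacts between upstream bins and downstream bins of bin i.
        for a in range(i - window, i):
            for b in range(i + 1, i + window + 1):
                total += matrix[a][b]
        profile[i] = total / (window * window)
    return profile


# Hypothetical toy matrix: two 4-bin domains with dense internal contacts
# (10.0) and sparse contacts between domains (1.0).
n_bins = 8
toy = [[10.0 if a // 4 == b // 4 else 1.0 for b in range(n_bins)]
       for a in range(n_bins)]
profile = insulation_profile(toy, window=2)
print(profile)
```

The profile dips at the bins flanking the junction between the two toy domains, which is where a boundary caller would place the boundary. Ranking boundaries by the depth of such dips is one way to categorize them by strength.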
Project description:The spatial organization of chromatin is pivotal for regulating genome functions. We report an imaging method for tracing chromatin organization with kilobase- and nanometer-scale resolution, unveiling chromatin conformation across topologically associating domains (TADs) in thousands of individual cells. Our imaging data revealed TAD-like structures with globular conformation and sharp domain boundaries in single cells. The boundaries varied from cell to cell, occurring with nonzero probabilities at all genomic positions but preferentially at CCCTC-binding factor (CTCF)- and cohesin-binding sites. Notably, cohesin depletion, which abolished TADs at the population-average level, did not diminish TAD-like structures in single cells but eliminated preferential domain boundary positions. Moreover, we observed widespread, cooperative, multiway chromatin interactions, which remained after cohesin depletion. These results provide critical insight into the mechanisms underlying chromatin domain and hub formation.
Project description:Topologically associating domains (TADs) are fundamental structural and functional building blocks of human interphase chromosomes, yet the mechanisms of TAD formation remain unclear. Here, we propose that loop extrusion underlies TAD formation. In this process, cis-acting loop-extruding factors, likely cohesins, form progressively larger loops but stall at TAD boundaries due to interactions with boundary proteins, including CTCF. Using polymer simulations, we show that this model produces TADs and finer-scale features of Hi-C data. Each TAD emerges from multiple loops dynamically formed through extrusion, contrary to typical illustrations of single static loops. Loop extrusion both explains diverse experimental observations (including the preferential orientation of CTCF motifs, enrichments of architectural proteins at TAD boundaries, and boundary deletion experiments) and makes specific predictions for the depletion of CTCF versus cohesin. Finally, loop extrusion has potentially far-ranging consequences for processes such as enhancer-promoter interactions, orientation-specific chromosomal looping, and compaction of mitotic chromosomes.
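The stalling behaviour described in this abstract can be sketched with a deliberately simplified one-dimensional toy in Python. It is not the polymer simulation used in the study: it assumes a single extruder, fully impermeable barriers that ignore motif orientation, and deterministic stepping. The function name, lattice size, and barrier positions are all hypothetical.

```python
def extrude(load_site, barriers, n_sites, max_steps=1000):
    """Extrude a loop outward from load_site until both legs stall.

    Returns the list of (left, right) loop anchors after each step.
    A leg stalls when it sits on a barrier or reaches the lattice end.
    """
    left = right = load_site
    anchors = []
    for _ in range(max_steps):
        left_can_move = left > 0 and left not in barriers
        right_can_move = right < n_sites - 1 and right not in barriers
        if not (left_can_move or right_can_move):
            break  # both legs have stalled
        if left_can_move:
            left -= 1
        if right_can_move:
            right += 1
        anchors.append((left, right))
    return anchors


# Hypothetical lattice of 30 sites with boundary elements at sites 10 and 20.
barriers = {10, 20}
trajectory = extrude(load_site=14, barriers=barriers, n_sites=30)
print(trajectory[-1])

# Every extruder loaded strictly between the barriers ends on the same
# anchors, so the loops it forms stay inside one domain.
final_anchors = {extrude(s, barriers, 30)[-1] for s in range(11, 20)}
print(final_anchors)
```

Because loops loaded at different sites pass through many intermediate anchor pairs before stalling, averaging over them enriches contacts within the interval between the barriers, which is the intuition behind a TAD emerging from multiple dynamic loops rather than one static loop.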
Project description:The architectural protein CTCF is a mediator of chromatin conformation, but how CTCF binding to DNA is orchestrated to maintain long-range gene expression is poorly understood. Here we perform RNAi knockdown to reduce CTCF levels and reveal a shared subset of CTCF-bound sites are robustly resistant to protein depletion. The 'persistent' CTCF sites are enriched at domain boundaries and chromatin loops constitutive to all cell types. CRISPR-Cas9 deletion of 2 persistent CTCF sites at the boundary between a long-range epigenetically active (LREA) and silenced (LRES) region, within the Kallikrein (KLK) locus, results in concordant activation of all 8 KLK genes within the LRES region. CTCF genome-wide depletion results in alteration in Topologically Associating Domain (TAD) structure, including the merging of TADs, whereas TAD boundaries are not altered where persistent sites are maintained. We propose that the subset of essential CTCF sites are involved in cell-type constitutive, higher order chromatin architecture.
Project description:CTCF (CCCTC-binding factor) is a transcription regulator with hundreds of binding sites in the human genome. It has a main function as an insulator protein, defining together with cohesins the boundaries of areas of the genome called topologically associating domains (TADs). TADs contain regulatory elements such as enhancers which function as regulators of the transcription of genes inside the boundaries of the TAD while they are restricted from regulating genes outside these boundaries. This paper will examine the most common genetic lesions of CTCF as well as its related protein CTCFL (CTCF-like also called BORIS) in cancer using publicly available data from published genomic studies. Cancer types where abnormalities in the two genes are more common will be examined for possible associations with underlying repair defects or other prevalent genetic lesions. The putative functional effects in CTCF and CTCFL lesions will also be explored.
Project description:Background: CTCF binding contributes to the establishment of a higher-order genome structure by demarcating the boundaries of large-scale topologically associating domains (TADs). However, despite the importance and conservation of TADs, the role of CTCF binding in their evolution and stability remains elusive. Results: We carry out an experimental and computational study that exploits the natural genetic variation across five closely related species to assess how CTCF binding patterns stably fixed by evolution in each species contribute to the establishment and evolutionary dynamics of TAD boundaries. We perform CTCF ChIP-seq in multiple mouse species to create genome-wide binding profiles and associate them with TAD boundaries. Our analyses reveal that CTCF binding is maintained at TAD boundaries by a balance of selective constraints and dynamic evolutionary processes. Regardless of their conservation across species, CTCF binding sites at TAD boundaries are subject to stronger sequence and functional constraints compared to other CTCF sites. TAD boundaries frequently harbor dynamically evolving clusters containing both evolutionarily old and young CTCF sites as a result of the repeated acquisition of new species-specific sites close to conserved ones. The overwhelming majority of clustered CTCF sites colocalize with cohesin and are significantly closer to gene transcription start sites than nonclustered CTCF sites, suggesting that CTCF clusters particularly contribute to cohesin stabilization and transcriptional regulation. Conclusions: Dynamic conservation of CTCF site clusters is an apparently important feature of CTCF binding evolution that is critical to the functional stability of a higher-order chromatin structure.