Project description:Single-cell mapping of chromosomal accessibility patterns has recently led to improved predictive modelling of epigenomic activity from sequence. However, quantitative models explaining the epigenome using directly interpretable components are still lacking. Here we develop IceQream (IQ), a modelling strategy and inference algorithm for regressing accessibility from sequences using physical models of transcription factor (TF) binding. IQ uses spatial integration of sequences over a range of TF-DNA affinities and localization relative to the target locus. It infers TF effective concentrations as latent variables that activate or repress regulatory elements in a non-linear fashion. These are supplemented with synergistic and antagonistic pairwise interactions between TFs. Analysis of both human and mouse data shows that IQ achieves similar, and in some cases better, performance compared to state-of-the-art deep neural network models. IQ provides an essential mechanistic and explicable baseline for further developments toward understanding gene and genome regulation from sequence.
Project description:Purpose: The goal of this study is to use chromatin accessibility data to study the information patterns of chromatin accessibility that encode TF-chromatin interactions. Methods: We generated chromatin accessibility data from the GM12878 lymphoblastoid cell line using a modified protocol in which the ATAC-seq fragments are sonicated during library preparation. The goal of the sonication step is to disrupt the observed fragment size distribution, thereby masking local TF-chromatin interactions encoded in the larger fragment sizes (e.g. nucleosome phasing) without affecting the levels of chromatin accessibility.
Project description:This dataset uses DNase-seq to profile the genome-wide DNase I hypersensitivity of mES and mES-derived cells along an early pancreatic lineage and provides the locations of putative Transcription Factor (TF) binding sites using the PIQ algorithm. DNase-seq takes advantage of the preferential cutting of DNase I in open chromatin and the steric blockage of DNase I by tightly bound TFs that protect associated genomic DNA sequences. After deep sequencing of DNase I–digested genomic DNA from intact nuclei, genome-wide data on chromatin accessibility as well as TF-specific DNase I protection profiles that reveal the genomic binding locations of a majority of TFs are obtained. Such TF signature ‘DNase profiles’ reflect the effect of the TF on DNA shape and local chromatin architecture, extending hundreds of base pairs from a TF binding site, and these profiles are centered on ‘DNase footprints’ at the binding motif itself, which reflect the biophysics of protein-DNA binding. An algorithm, PIQ, is then applied that models the specific profile of each factor and, in combination with sequence information, predicts the likely binding locations of over 700 TFs genome wide. This dataset includes DNase-seq hypersensitivity data from 6 mES-derived cell types: mESC, Mesendoderm, Mesoderm, Endoderm, Intestinal Endoderm, and Prepancreatic Endoderm. For each cell type, TF binding site predictions are made based on the identification of TF-specific DNase-seq profiles over any of 1331 possible binding motifs. After significance thresholding, genome-wide binding site predictions for <700 TFs are included.
Project description:The huge size, the redundancy and the great repeated portion of the bread wheat genome [Triticum aestivum (L.)] place it among the most difficult species to be sequenced and dissected at the genetic, structural and evolutionary levels. To overcome these limitations, a strategy based on the genome compartmentalization in individual chromosomes and the subsequent production of physical maps was established within the frame of the International Wheat Genome Sequence Consortium. A total of 95,812 BAC clones of short (5AS) and long (5AL) arm-specific BAC libraries were fingerprinted and assembled into contigs by complementary analytical approaches based on FingerPrinted Contigs and Linear Topological Contig. Combined anchoring approaches based on PCR marker screening, microarray and BlastN searches, applied to interlinked genomic tools (genetic maps, deletion bin map, high-density neighbor map, BAC end sequences, genome zipper and chromosome survey sequences), allowed the development of a high quality physical map, with an anchored physical coverage of 75% for 5AS and 53% for 5AL, with high portions (64 and 48%, respectively) ordered along the chromosome. The gene distribution along the wheat chromosome 5A compared with the closest related genomes showed a pattern of syntenic blocks belonging to different chromosomes of Brachypodium, rice and sorghum and regions involving translocations and inversions. The physical map presented here is currently the most comprehensive map for chromosome 5A and represents an essential resource for fine genetic mapping and map-based cloning of agronomically relevant traits, and a reference for the 5A sequencing projects. The dataset comprises 55 DNA pools of the short arm of chromosome 5A and 63 DNA pools of the long arm of 5A. The DNAs derive from BAC clones of the Minimal Tiling Paths produced by physical assembly of BAC fingerprints.
Project description:Animal and human studies suggest that inflammation is associated with behavioral disorders including aggression. We have recently shown that physical aggression of boys during childhood is strongly associated with reduced plasma levels of cytokines IL-1a, IL-4, IL-6, IL-8 and IL-10 later in early adulthood (Provencal et al., unpublished data). This study tests the hypothesis that there is an association between differentially methylated DNA regions in cytokine genes in T cell and monocyte DNA in adult subjects and a trajectory of physical aggression from childhood to adolescence. We compared the methylation profiles of the entire genomic loci encompassing the IL-1a, IL-6, IL-4, IL-10 and IL-8 genes and the genes of three of their regulatory transcription factors (TF), NFkB1, NFAT5 and STAT6, in adult males on a chronic physical aggression trajectory (CPA) and males with the same background who followed a normal physical aggression trajectory (control group). We used methylated DNA immunoprecipitation (MeDIP) with comprehensive cytokine gene loci and TF loci microarray hybridization, statistical analysis and false discovery rate correction. We recruited two groups of Caucasian males who were born in families with a low socioeconomic status and were living at the time of the present study within 200 km of our laboratory. The first group, composed of 8 subjects, had a history of chronic physical aggression from age 6 to 15 years (chronic physical aggression group, CPA). The second group, composed of 12 subjects, was recruited from the same longitudinal studies but included only those who did not have a history of chronic physical aggression from age 6 to 15 (Control group, CG). Using custom-designed microarrays with 44K probes tiling the cytokine genes IL-1a, IL-6, IL-4, IL-10 and IL-8 and the genes of three of their regulatory transcription factors (TF), NFkB1, NFAT5 and STAT6, we obtained DNA methylation profiles by MeDIP-chip. Each profile was generated in triplicate.
Project description:Epigenetic mechanisms govern the transcriptional activity of lineage-specifying enhancers, but recent work challenges the dogma that both chromatin accessibility and DNA hypomethylation are prerequisites for transcription, highlighting a need to understand their coordinated dynamics. We established a highly resolved timeline of DNA demethylation, chromatin accessibility, and transcription factor (TF) occupancy during early human cell differentiation. We show that >30,000 lineage-specifying enhancers undergo rapid and transient accessibility changes associated with distinct periods of TF expression. By contrast, enhancer DNA methylation changes are prolonged, unidirectional and delayed relative to chromatin dynamics, creating discordant epigenetic states. Detecting the methylation intermediate 5hmC by 6-base sequencing revealed that, for a subset of enhancers, TET-mediated active demethylation begins prior to, and is maintained independently of, TF binding. In fact, machine learning models trained on 5-hydroxymethylation can predict future chromatin states. Complete demethylation persists long after TF binding and accessibility have dissipated, suggesting that long-lasting hypomethylation of certain enhancers is a historical record of previous activity.