Project description:The Bxb1 bacteriophage serine DNA recombinase is an efficient tool for engineering recombinant DNA into the genomes of cultured cells. Generally, a single engineered “landing pad” site is introduced into the cell genome, permitting the integration of transgenic circuits or libraries of transgene variants. While sufficient for many studies, the extent of genetic manipulation possible with a single recombinase site is limiting, and insufficient for more complex cell-based assays for protein function. Here, we harnessed two orthogonal Bxb1 recombinase sites to enable new avenues for mammalian synthetic biology. By designing plasmids with two recombinase sites, we demonstrate that we can avoid genomic integration of undesirable bacterial DNA elements. We also created “double landing pad” cells simultaneously harboring two orthogonal Bxb1 recombinase sites. These cells allow transgenic protein variant libraries to be readily paired with assay-specific protein partners or biosensors, opening up new functional readouts for large-scale functional assays.
Project description:Adenine base editors (ABEs) with wide CRISPR compatibility and high activity improve editing efficiency but also raise off-target concerns. Here, we carried out a comprehensive evaluation of ABE8e- and ABE9-induced DNA and RNA mutations in the model organism rice. Whole-genome sequencing of plants carrying rBE46b (SpCas9n-TadA8e), rBE49b (SpCas9n-TadA9), rBE50 (SpCas9n-NG-TadA8e), or rBE53 (SpCas9n-NG-TadA9) reveals that ABEs with TadA9 lead to a higher number of off-target A>G SNVs and that ABEs with SpCas9n-NG lead to a higher total number of off-target SNVs. Analysis of the T-DNAs (the ABE carriers) showed that on-target mutations can occur both before and after T-DNA integration into the plant genome, while ABEs integrated into the plant genome lead to more A>G SNVs. We also detected off-target A>G RNA mutations in plants with higher ABE expression but not in plants with lower ABE expression. The off-target A>G RNA mutations tend to cluster, whereas off-target A>G DNA mutations only rarely cluster. The finding that the CRISPR variant, the TadA variant, T-DNA integration, and ABE expression all contribute to ABE specificity suggests alternative ways to increase the specificity of ABEs.
Project description:Methylation is a repressive modification of DNA prevalent throughout mammalian genomes yet mostly absent at CG-rich stretches referred to as CpG islands (CGIs). Here we identify their building principles by parallel genomic targeting of sequence libraries. Iterative insertions generated over 3,000 variants of genome-derived and artificial sequences at the same genomic site. Single-molecule profiling of the methylation status of this collection allowed modeling the contribution of CG content and DNA-binding factors towards the unmethylated state. The model made the surprising prediction that the majority of CGs within endogenous islands are susceptible to methylation changes modulated by the presence of transcription factors (TFs), which is indeed confirmed by genome-wide methylation dynamics during multiple cellular differentiations. Our model further predicts blocks of constitutively unmethylated CGs independent of TF binding, which have a median size of ~300 bp but are present in only half of all islands. Their constitutively unmethylated state is a hallmark of untransformed cells, but their increased methylation is a specific and predictive feature of cancer. This study quantifies the two principal mechanisms governing methylation patterns in mammalian genomes. It provides a framework to interpret methylation data across normal and cancer samples and refines the concept of CpG islands. Libraries of DNA sequences were constructed either by subrepresentation of the mouse genome (129S6) or the E. coli genome (NC_010473.1) or by custom synthesis. DNA fragments were inserted into the genome of mouse embryonic stem cells by recombination-mediated cassette exchange (RMCE) at the β-globin locus. Methylation status of the inserted DNA sequences was profiled by bisulfite sequencing using a pair of universal primers flanking the fragments.
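As an illustration of how a bisulfite readout like the one described above is typically interpreted, the following is a minimal sketch, not the authors' pipeline. It assumes reads are top-strand, already aligned, and exactly as long as the reference fragment; after bisulfite conversion an unmethylated C reads as T, while a methylated C stays C. The function names `cpg_positions` and `methylation_per_cpg` and the toy sequences are hypothetical.

```python
from typing import Dict, List


def cpg_positions(reference: str) -> List[int]:
    """Return 0-based indices of the C in every CG dinucleotide."""
    ref = reference.upper()
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]


def methylation_per_cpg(reference: str, reads: List[str]) -> Dict[int, float]:
    """Fraction of reads that retain a C at each CpG of the reference.

    Assumption: every read is pre-aligned to the top strand and has the
    same length as the reference, so read[pos] corresponds to reference[pos].
    """
    fractions: Dict[int, float] = {}
    for pos in cpg_positions(reference):
        methylated = 0
        unmethylated = 0
        for read in reads:
            base = read[pos].upper()
            if base == "C":      # protected from conversion: methylated
                methylated += 1
            elif base == "T":    # converted by bisulfite: unmethylated
                unmethylated += 1
            # any other base (N, sequencing error) is ignored
        total = methylated + unmethylated
        if total:
            fractions[pos] = methylated / total
    return fractions


# Toy example: one reference fragment with two CpGs and four reads.
reference = "ACGTTCGA"
reads = ["ACGTTTGA", "ATGTTTGA", "ACGTTCGA", "ATGTTTGA"]
print(methylation_per_cpg(reference, reads))
```

Because each read comes from a single DNA molecule, keeping the per-read calls rather than only the averaged fractions is what would allow single-molecule analysis of the kind the description mentions.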
Project description:Oncogenic human papillomavirus (HPV) genomes are often integrated into host chromosomes in HPV-associated cancers. HPV genomes are integrated either as a single copy, or as tandem repeats of viral DNA interspersed with, or without, host DNA. Integration occurs frequently in common fragile sites susceptible to tandem repeat formation, and the flanking or interspersed host DNA often contains transcriptional enhancer elements. When co-amplified with the viral genome, these enhancers can form super-enhancer-like elements that drive high viral oncogene expression. Here, we compiled highly curated datasets of HPV integration sites in cervical (CESC) and head and neck squamous cell carcinoma (HNSCC) cancers and assessed the number of breakpoints, viral transcriptional activity, and host genome copy number at each insertion site. Tumors frequently contained multiple distinct HPV integration sites, but often only one “driver” site that expressed viral RNA. Since common fragile sites and active enhancer elements are cell-type specific, we mapped these regions in cervical cell lines using FANCD2 and Brd4/H3K27ac ChIP-seq, respectively. Large enhancer clusters, or super-enhancers, were also defined using the Brd4/H3K27ac ChIP-seq dataset. HPV integration breakpoints were enriched at both FANCD2-associated fragile sites, and enhancer-rich regions, and frequently showed adjacent focal DNA amplification in CESC samples. We identified recurrent integration “hotspots” that were enriched for super-enhancers, some of which function as regulatory hubs for cell-identity genes. We propose that during persistent infection, extrachromosomal HPV minichromosomes associate with these transcriptional epicenters, and accidental integration could promote viral oncogene expression and carcinogenesis.
Project description:Enhancers and promoters commonly occur in accessible chromatin characterized by depleted nucleosome contact; however, it is unclear how chromatin accessibility is governed. We show that a logic of cis-acting DNA sequence features can predict the majority of chromatin accessibility at high spatial resolution. We develop a new type of high-dimensional machine learning model, the Cooperative Chromatin Model (CCM), that is capable of predicting a large fraction of genome-wide chromatin accessibility at base-pair resolution in a range of human and mouse cell types from DNA sequence alone. We confirm that a CCM accurately predicts chromatin accessibility, even of a vast array of synthetic DNA sequences, with a novel CRISPR-based method of highly efficient site-specific DNA library integration. CCMs are directly interpretable and reveal that a logic based on local, non-specific cooperation, largely among pioneer TFs, is sufficient to predict a large fraction of cellular chromatin accessibility in a wide variety of cell types.
Project description:Colonization of genomes by a new selfish genetic element is detrimental to the host species and must lead to an efficient, repressive response. In vertebrates as well as in Drosophila, piRNAs repress transposons in the germ line while endogenous siRNAs take on this role in somatic cells. For endo-siRNAs as well as for piRNAs, it is unclear how an efficient response can be initiated de novo. Our experiments establish that the endo-siRNA pathway will target artificially introduced sequences without the need for a pre-existing template in the genome. This response is also triggered in transiently transfected cells, thus genomic integration is not essential. Deep sequencing revealed that corresponding endo-siRNAs are generated throughout the sequence, but preferentially from transcribed regions. Examination of 3 different cell lines.