Project description:Sequence-based deep learning models have become the state of the art for the analysis of the genomic regulatory code. Particularly for transcriptional enhancers, deep learning models excel at deciphering sequence features and grammar that underlie their spatiotemporal activity. To enable end-to-end enhancer modeling and design, we developed a software and modeling package, called CREsted. It combines preprocessing starting from single-cell ATAC-seq data; modeling with a choice of several architectures for training classification and regression models on either topics or pseudobulk peak heights; sequence design using multiple strategies; and downstream analysis through a collection of tools to locate transcription factor (TF) binding sites, infer the effect of a TF (activating or repressing) on enhancer accessibility, decipher enhancer grammar, and score gene loci. We demonstrate CREsted using a mouse cortex model that we validate using the BICCN collection of in vivo validated mouse brain enhancers. Classical enhancers in immune cells, including the IFN-β enhanceosome, are revisited using a PBMC model, and we assess the accuracy of TF binding site predictions with ChIP-seq. Additionally, we use CREsted to compare mesenchymal-like cancer cell states between tumor types, and we investigate different fine-tuning strategies of Borzoi within CREsted, comparing their performance and explainability with CREsted models trained from scratch. Finally, we train a CREsted model on a scATAC-seq atlas of zebrafish development, and use this to design and in vivo validate cell type-specific synthetic enhancers in three tissues. Across varied datasets, we demonstrate that CREsted facilitates efficient training and analyses, enabling scrutiny of the enhancer logic and design of synthetic enhancers across tissues and species. CREsted is available at https://crested.readthedocs.io.
Project description:We screened the genome of OCI-Ly-7, a diffuse large B-cell lymphoma cell line, for genes mediating the response to rituximab treatment. We used the genome-wide GeCKO v2 library (A+B) from Feng Zhang (Addgene #1000000049).
Project description:Genome-wide CRISPR screens were performed in the Huh-7 hepatoma cell line to identify genes regulating ferroptosis sensitivity. Huh-7 cells were transduced with the human GeCKO library and selected with RSL-3.
Project description:The draft genome of L. sativa (lettuce) cv. Tizian was sequenced in two Illumina sequencing runs, one mate-pair and one shotgun. This entry contains the raw sequencing data.
Project description:L. helveticus is used to modulate cheese flavor and as a starter organism in certain cheese varieties. Our group has compiled a draft (4x) sequence for the 2.4 Mb genome of the industrial strain L. helveticus CNRZ32. The primary aim was to investigate expression of 168 completely sequenced genes during growth in milk and MRS medium using microarrays. Oligonucleotide probes against each of the completely sequenced genes were compiled on maskless photolithography-based DNA microarrays. Additionally, the entire draft genome sequence was used to produce tiled microarrays in which the uninterrupted sequence contigs were covered by consecutive 24-mer probes. Keywords: growth conditions response
Project description:Because CNVs play a vital role in genomic studies, a comprehensive, more accurate, and higher-resolution porcine CNV map with practical value for follow-up CNV functional analyses is needed. To detect CNVs in pigs, we generated high-density aCGH data from diverse pig breeds within the framework of the pig draft genome sequence (Sscrofa10.2).