Dataset Information

ABSTRACT: Precise, predictable genome integrations by deep learning-assisted design of microhomology-based templates

PROVIDER: PRJNA1282594 | ENA

REPOSITORIES: ENA

Similar Datasets

Project description:Protein-Nucleic Acid Constrained Language Model Assisted Design of Precise

| PRJNA1291838 | ENA

Project description:Machine Learning based antimicrobial peptide design

| PRJNA1243451 | ENA

Massively parallel profiling and predictive modeling of the outcomes of CRISPR/Cas9-mediated double-strand break repair

Project description:Non-homologous end-joining (NHEJ) plays an important role in double-strand break (DSB) repair of DNA. Recent studies have shown that the error patterns of NHEJ are strongly biased by sequence context, but these studies were based on relatively few templates. To investigate this more thoroughly, we systematically profiled ~1.16 million independent mutational events resulting from CRISPR/Cas9-mediated cleavage and NHEJ-mediated DSB repair of 6,872 synthetic target sequences, introduced into a human cell line via lentiviral infection. We find that: 1) insertions are dominated by 1 bp events templated by sequence immediately upstream of the cleavage site, 2) deletions are predominantly associated with microhomology, and 3) targets exhibit variable but reproducible diversity with respect to the number and relative frequency of the mutational outcomes to which they give rise. From these data, we trained a model (Lindel) that uses local sequence context to predict the distribution of mutational outcomes. Exploiting the bias of NHEJ outcomes towards microhomology-mediated events, we demonstrate the programming of deletion patterns by introducing microhomology to specific locations in the vicinity of the DSB site. We anticipate that our results will inform investigations of DSB repair mechanisms as well as the design of CRISPR/Cas9 experiments for diverse applications including genome-wide screens, gene therapy, lineage tracing and molecular recording.

2019-06-07 | GSE131421 | GEO
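
The GSE131421 description above reports that deletions are predominantly associated with microhomology. As a rough illustration of that idea only (this is not Lindel and not the authors' method), the following Python sketch enumerates deletions that a pair of identical short sequences flanking a cut site could produce. The function name, the `cut` convention, the `min_mh` threshold, and the example sequence are all hypothetical, and the sketch makes the simplifying assumption that one copy lies entirely upstream of the cut and the other entirely downstream.

```python
from typing import List, NamedTuple


class MhDeletion(NamedTuple):
    microhomology: str   # bases shared by the two flanking copies
    deleted_length: int  # number of bases removed
    product: str         # sequence left after the deletion


def microhomology_deletions(seq: str, cut: int, min_mh: int = 2) -> List[MhDeletion]:
    """List deletions that microhomology pairs flanking `cut` could produce.

    Assumption: `cut` is the 0-based index of the first base downstream of
    the break; one copy lies fully upstream, the other fully downstream.
    """
    n = len(seq)
    found = []
    for i in range(cut):         # start of the upstream copy
        for j in range(cut, n):  # start of the downstream copy
            # Skip pairs that extend to the left; the longer match
            # starting one base earlier is reported instead.
            if i > 0 and j > cut and seq[i - 1] == seq[j - 1]:
                continue
            k = 0
            while i + k < cut and j + k < n and seq[i + k] == seq[j + k]:
                k += 1
            if k >= min_mh:
                # Keep everything through the upstream copy, then resume
                # after the downstream copy: one copy survives.
                found.append(MhDeletion(seq[i:i + k], j - i, seq[:i + k] + seq[j + k:]))
    # Longest microhomology first, then shortest deletion.
    return sorted(found, key=lambda d: (-len(d.microhomology), d.deleted_length))


if __name__ == "__main__":
    for d in microhomology_deletions("GGACGCCTTACGAA", cut=7):
        print(d)
```

For the toy sequence, the sketch finds a 3 bp microhomology (`ACG`) that yields a 7 bp deletion and a 2 bp microhomology (`GA`) that yields a 10 bp deletion. It does not estimate how often either outcome occurs.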

Design and deep learning of synthetic B cell-specific promoters

Project description:Design and deep learning of synthetic B cell-specific promoters

| PRJNA971148 | ENA

Programmatic design and editing of cis-regulatory elements

Project description:The development of modern genome editing and DNA synthesis has enabled researchers to edit DNA sequences with high precision but has left unsolved the problem of designing these edits. We introduce Ledidi, a computational method that rephrases the discrete design task of choosing which edits to make as an easily solvable continuous optimization problem. Ledidi can use any pre-trained deep learning model to guide the optimization, yielding an edited sequence that exhibits the desired outcome while explicitly minimizing the number of edits. When applied in dozens of settings, we find that Ledidi's designs can precisely control transcription factor binding, chromatin accessibility, transcription, and enhancer activity in silico. By using several deep learning models simultaneously, we design cell type-specific enhancers and experimentally validate them in cellulo. Finally, we introduce the concept of an "affinity catalog", where the design task is repeated multiple times across continuous variants of the design target. We demonstrate how these catalogs can be used to interpret deep learning models and the impact of starting template sequences, and also to design regulatory elements that control transcriptional dosage in a cell type-specific fashion.

2025-12-05 | GSE312234 | GEO

Project description:Protein-Nucleic Acid constrained Language Model-assisted Design of Precise and Compact Adenine Base Editor

| PRJNA1155667 | ENA

Easy-Prime: a machine learning–based prime editor design tool

Project description:Easy-Prime: a machine learning–based prime editor design tool

| PRJNA734350 | ENA

G-quadruplex profiling in complex tissues using single-cell CUT&Tag

Project description:G-quadruplexes (G4) are non-canonical DNA structures that have gained increasing attention for their potential roles in gene regulation, with implications in neurodegenerative diseases and cancer. Despite their biological significance, G4 structures have not been studied systematically across tissues and cell types. In this study, we employ G4 single-cell CUT&Tag (G4 scCUT&Tag) to characterize G4 landscapes in postnatal mouse brain cells, leveraging single-cell analytical approaches commonly used in scRNA-Seq and scATAC-Seq datasets. Using conventional single-cell omics workflows to process and explore our data, we distinguished different cell types based on G4 heterogeneity. Furthermore, we performed uncoupled multi-omics integration of G4 scCUT&Tag data with scRNA-Seq gene expression profiles, using both a covariance-based technique (canonical correlation analysis) and a transfer learning-based deep learning approach. These integrations not only revealed significant co-enrichment of G4 and gene expression signals, but also demonstrated that G4 scCUT&Tag enables detailed examination of G4 heterogeneity in complex tissues and supports integrative analysis of G4 profiles with other omics layers, offering new insights into the epigenomic landscapes of the developing central nervous system.

2025-04-03 | GSE291468 | GEO

Gene expression profile Predictor on chemical Structures (GPS): Deep Learning-based platform to screen and design novel therapeutics

Project description:Gene expression profile Predictor on chemical Structures (GPS): Deep Learning-based platform to screen and design novel therapeutics

| PRJNA1235680 | ENA

Transcript copy number estimation using a mouse whole-genome oligonucleotide microarray (22k Linearity)

Project description:E12.5 mouse whole embryo and E12.5 placenta total RNA were pooled to create 25:75, 50:50, and 75:25 ratio mixtures, based on Bioanalyzer quantitation. These samples, along with the original unmixed RNAs, were used as templates for duplicate linear amplification labeling reactions. cRNA target mixtures were hybridized against a Universal Mouse Reference (Stratagene). Pairwise comparison using the NIA Microarray Analysis (ANOVA) software produced log ratios, which were compared to the expected log ratios for genes showing statistically significant (FDR<0.05) differential expression between unmixed embryo and placenta. Keywords: cell type comparison design, development or differentiation design, normalization testing design, reference design

2005-10-28 | GSE3508 | GEO
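
The GSE3508 description above compares observed log ratios with expected log ratios for the 25:75, 50:50, and 75:25 mixtures, but does not say how the expected values were derived. One plausible reading, given here as an assumption only, is linear mixing: the mixture signal is the fraction-weighted sum of the two unmixed signals, each measured against the same reference. The Python sketch below works through that arithmetic. The function name and the choice of log base 10 are hypothetical.

```python
import math


def expected_mixture_log_ratio(
    log_embryo: float, log_placenta: float, embryo_fraction: float
) -> float:
    """Expected log10(mixture / reference) under linear mixing.

    Assumptions (not stated in the source): log ratios are base 10, both
    unmixed samples share one reference, and signal adds linearly with
    the RNA fraction.
    """
    if not 0.0 <= embryo_fraction <= 1.0:
        raise ValueError("embryo_fraction must lie in [0, 1]")
    embryo = 10 ** log_embryo      # embryo / reference on a linear scale
    placenta = 10 ** log_placenta  # placenta / reference on a linear scale
    mixed = embryo_fraction * embryo + (1.0 - embryo_fraction) * placenta
    return math.log10(mixed)


if __name__ == "__main__":
    # Hypothetical gene: 10-fold over reference in embryo, equal in placenta.
    for fraction in (0.25, 0.50, 0.75):
        print(fraction, round(expected_mixture_log_ratio(1.0, 0.0, fraction), 3))
```

Under this assumption the expected log ratio is not a straight-line interpolation between the two unmixed log ratios, because the averaging happens on the linear scale before the logarithm is taken.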
