Reinforcement learning guided directed evolution of synthetic cis-regulatory elements
ABSTRACT: The design of synthetic cis-regulatory elements (CREs) offers a route to engineer context-specific transcriptional programs, but optimizing functional synthetic CREs efficiently remains challenging. Here, we present GO-CRE (Guided Optimization of CRE), a reinforcement learning (RL)-guided framework built on a hybrid Transformer-Mamba2 architecture for efficient iterative CRE design. Across K562, HepG2, and SK-N-SH cells, GO-CRE drives CRE generation along directional trajectories in a reconstructed sequence-grammar landscape, revealing a staged process of in silico directed evolution. We resolve this process into three phases: search, commitment, and optimization. Productive trajectories in K562 and HepG2 progress through all three phases and converge on optimized regulatory programs, whereas SK-N-SH trajectories become trapped in local sequence-grammar basins. In HepG2, tuning k-mer composition redirects evolution away from low-complexity traps toward functional motif programs that retain HNF, FOXA, and related regulatory features. Generated synthetic CREs retain sequence diversity while converging on lineage-aligned grammar features. Lentiviral massively parallel reporter assays validate target-selective activity in K562 and HepG2 and show that HepG2-generated CREs exceed endogenous CREs on average, whereas the strongest generated and endogenous sequences perform comparably. Together, these findings establish GO-CRE as an interpretable framework for RL-guided directed evolution of synthetic CREs.
ORGANISM(S): Homo sapiens
PROVIDER: GSE330487 | GEO | 2026/05/15
REPOSITORIES: GEO