
Dataset Information


Deep flanking sequence engineering for efficient promoter design using DeepSEED.


ABSTRACT: Designing promoters with desirable properties is essential in synthetic biology. Human experts are skilled at identifying strong explicit patterns in small samples, while deep learning models excel at detecting implicit weak patterns in large datasets. Biologists have described the sequence patterns of promoters via transcription factor binding sites (TFBSs). However, the flanking sequences of cis-regulatory elements have long been overlooked and are often chosen arbitrarily in promoter design. To address this limitation, we introduce DeepSEED, an AI-aided framework that efficiently designs synthetic promoters by combining expert knowledge with deep learning techniques. DeepSEED has demonstrated success in improving the properties of Escherichia coli constitutive, IPTG-inducible, and mammalian cell doxycycline (Dox)-inducible promoters. Furthermore, our results show that DeepSEED captures the implicit features in flanking sequences, such as k-mer frequencies and DNA shape features, which are crucial for determining promoter properties.

SUBMITTER: Zhang P 

PROVIDER: S-EPMC10562447 | biostudies-literature | 2023 Oct

REPOSITORIES: biostudies-literature


Publications

Deep flanking sequence engineering for efficient promoter design using DeepSEED.

Zhang Pengcheng, Wang Haochen, Xu Hanwen, Wei Lei, Liu Liyang, Hu Zhirui, Wang Xiaowo

Nature communications, 2023-10-09, issue 1



Similar Datasets

| S-EPMC9898993 | biostudies-literature
| S-EPMC9997061 | biostudies-literature
| S-EPMC1931564 | biostudies-literature
| S-EPMC525683 | biostudies-literature
| S-EPMC5440218 | biostudies-literature
| S-EPMC8745427 | biostudies-literature
| S-EPMC1855863 | biostudies-literature
| S-EPMC7067682 | biostudies-literature
| S-EPMC9530848 | biostudies-literature
| S-EPMC3439250 | biostudies-literature