Dataset Information

SAPPHIRE.CNN: Implementation of dRNA-seq-driven, species-specific promoter prediction using convolutional neural networks.


ABSTRACT: Data availability is a consistent bottleneck for the development of bacterial species-specific promoter prediction software. In this work we leverage genome-wide promoter datasets generated with dRNA-seq in the Gram-negative bacteria Pseudomonas aeruginosa and Salmonella enterica for promoter prediction. Convolutional neural networks are presented as an optimal architecture for model training and are further modified and tailored for promoter prediction. The resulting predictors reach high binary accuracies (95% and 94.9%) on test sets, and each outperforms the other when predicting promoters in its associated species. SAPPHIRE.CNN is available online and can also be downloaded to run locally. Our results indicate that binary promoter classification depends on an organism's GC content and that our classifiers perform worse on genera they were not trained on, further supporting the need for dedicated, species-specific promoter classification tools.
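
As a rough illustration of the kind of input preparation a sequence-based classifier of this type typically relies on, the sketch below one-hot encodes a DNA sequence and computes its GC content, the property the abstract links to classification performance. It is not taken from SAPPHIRE.CNN: the function names, the handling of ambiguous bases, and the example sequence are all assumptions made for illustration.

```python
# Illustrative sketch only; this is NOT the SAPPHIRE.CNN implementation.
# Function names and the example sequence are hypothetical.

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows ordered A, C, G, T.

    Symbols outside A/C/G/T (for example N) become an all-zero row.
    """
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def gc_content(seq):
    """Return the fraction of G and C among the A/C/G/T bases of seq.

    Ambiguous symbols are ignored; an input with no valid bases gives 0.0.
    """
    counted = [b for b in seq.upper() if b in BASES]
    if not counted:
        return 0.0
    return sum(b in "GC" for b in counted) / len(counted)


if __name__ == "__main__":
    example = "TTGACAnGC"  # hypothetical fragment, not from the dataset
    encoded = one_hot(example)
    print(len(encoded), len(encoded[0]))  # sequence length, channels
    print(round(gc_content(example), 3))
```

A matrix of this shape (sequence length by four channels) is a common input layout for one-dimensional convolutional layers; whether SAPPHIRE.CNN uses exactly this encoding is not stated in this record.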

SUBMITTER: Coppens L 

PROVIDER: S-EPMC9478156 | biostudies-literature | 2022

REPOSITORIES: biostudies-literature

Publications

SAPPHIRE.CNN: Implementation of dRNA-seq-driven, species-specific promoter prediction using convolutional neural networks.

Lucas Coppens, Laura Wicke, Rob Lavigne

Computational and Structural Biotechnology Journal, 2022-09-09


Similar Datasets

| S-EPMC8609763 | biostudies-literature
| S-EPMC9729992 | biostudies-literature
| S-EPMC8812003 | biostudies-literature
| S-EPMC7689358 | biostudies-literature
| S-EPMC9546588 | biostudies-literature
| S-EPMC5870713 | biostudies-literature
| S-EPMC8682773 | biostudies-literature
| S-EPMC6841655 | biostudies-literature
| S-EPMC7924482 | biostudies-literature
| S-EPMC5932613 | biostudies-literature