Dataset Information

Effective sequence-to-expression prediction for membrane proteins using machine learning and computational protein design


ABSTRACT: This study uses computational sequence design to generate a series of integral membrane protein variants. The aim is to determine the cellular expression profile of this variant library and to use that information to train predictive sequence-to-expression models. Two populations of these variants are selected by cytometry based on a GFP fusion expression phenotype. The 'high' and 'low' expressing populations are each sequenced independently by long-read Nanopore sequencing. The files here are (1) raw Nanopore sequencing data, provided for potential future reanalysis, and (2) processed data generated by post-run basecalling with Dorado 0.7.1 and the model dna_r10.4.1_e8.2_400bps_sup@v5.0.0.

ORGANISM(S): Escherichia coli

SUBMITTER: Paul Curnow 

PROVIDER: S-BSST2184 | biostudies-other

REPOSITORIES: biostudies-other

Similar Datasets

| S-EPMC8804200 | biostudies-literature
2024-07-05 | GSE237017 | GEO
2013-01-01 | E-GEOD-29210 | biostudies-arrayexpress
| S-EPMC5734395 | biostudies-literature
2021-06-01 | GSE171549 | GEO
| S-EPMC8847649 | biostudies-literature
| S-EPMC9842610 | biostudies-literature
| S-EPMC4108999 | biostudies-literature
2021-06-02 | GSE175942 | GEO
2013-01-01 | GSE29210 | GEO