<HashMap><database>biostudies-other</database><scores/><additional><omics_type>Unknown</omics_type><submitter>Paul Curnow</submitter><funding>UKRI</funding><species>Escherichia coli</species><full_dataset_link>https://www.ebi.ac.uk/biostudies/studies/S-BSST2184</full_dataset_link><repository>biostudies-other</repository><funding_grant_id>BB/W003449/1</funding_grant_id><funding_grant_id>EP/W524414/1</funding_grant_id><funding_grant_id>BB/T00875X/1</funding_grant_id><pubmed_authors>Paul Curnow</pubmed_authors></additional><is_claimable>false</is_claimable><name>Effective sequence-to-expression prediction for membrane proteins using machine learning and computational protein design</name><description>This study generates a series of integral membrane protein variants using computational sequence design. The aim of the work is to determine the cellular expression profile of this variant library and to use this information to train predictive sequence-to-expression models. Two populations of these proteins are selected by cytometry based on a GFP fusion expression phenotype. Both the 'high' and 'low' expressing populations are independently sequenced by long-read Nanopore sequencing. The files here are (1) raw Nanopore sequencing data, provided for potential future reanalysis, and (2) processed data generated by post-run basecalling with Dorado 0.7.1 and model dna_r10.4.1_e8.2_400bps_sup@v5.0.0.</description><dates><release>2025-09-20T00:00:00Z</release><modification>2026-03-19T03:31:02.729Z</modification><creation>2025-09-19T11:26:37.468Z</creation></dates><accession>S-BSST2184</accession><cross_references/></HashMap>