Unveiling the small protein landscape of uropathogenic E. coli
ABSTRACT: Uropathogenic Escherichia coli (UPEC) is the predominant causative agent of urinary tract infections, responsible for millions of infections each year. The UPEC strain CFT073 shares only about 40% of its protein-coding genes with the E. coli laboratory strain MG1655. Its mosaic genome contains 60 large genomic islands derived from horizontal gene transfer and mobile elements, and many of its virulence genes reside in these pathogenicity islands. In this study, we explored the small proteome of CFT073 using ribosome profiling coupled with RNA sequencing and identified 138 unannotated small open reading frames (sORFs) in intergenic regions. The translation efficiency of all of these sORFs was greater than one, indicating that their mRNA transcripts were occupied by ribosomes and undergoing active translation. Thirty of the sORFs are annotated in other genomes; the remaining 108 encode hypothetical or novel proteins. Reanalysis of conventional proteomic data confirmed protein translation from five of these sORFs. Prediction of transmembrane helices indicated that eight small proteins encoded by novel sORFs may be associated with lipid membranes. Notably, the sORFs for five of these eight proteins lie in pathogenicity islands of the genome. We tested their membrane localization using a cell-free system containing lipid sponge droplets. Our study revealed the small proteome of an extraintestinal pathogenic E. coli and provided a catalog of sORFs for further investigation.
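The translation-efficiency criterion mentioned in the abstract can be illustrated with a minimal sketch. Translation efficiency is commonly taken as the ratio of normalized ribosome-footprint density to normalized RNA-seq density over the same ORF. The abstract does not state which normalization the study used, so the RPKM normalization, the function names, and the read counts below are all assumptions chosen for illustration, not the authors' pipeline.

```python
def rpkm(read_count, length_nt, total_mapped_reads):
    """Reads per kilobase of ORF per million mapped reads (assumed normalization)."""
    return read_count / (length_nt / 1_000) / (total_mapped_reads / 1_000_000)


def translation_efficiency(ribo_count, rna_count, length_nt, ribo_total, rna_total):
    """Ratio of ribosome-footprint density to mRNA density for one ORF.

    Returns None when there is no RNA-seq coverage, because the ratio
    is undefined in that case.
    """
    rna_density = rpkm(rna_count, length_nt, rna_total)
    if rna_density == 0:
        return None
    ribo_density = rpkm(ribo_count, length_nt, ribo_total)
    return ribo_density / rna_density


# Hypothetical 90-nt sORF with made-up counts (not taken from the dataset).
te = translation_efficiency(
    ribo_count=120, rna_count=60, length_nt=90,
    ribo_total=2_000_000, rna_total=2_000_000,
)
print(te is not None and te > 1)  # True when the sORF would pass a TE > 1 filter
```

Because the ORF length appears in both densities, it cancels in the ratio; it is kept here only so that the intermediate densities are meaningful on their own.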
ORGANISM(S): Escherichia coli CFT073
PROVIDER: GSE301040 | GEO | 2026/06/30
REPOSITORIES: GEO