Project description: With whole-genome sequencing on next-generation sequencers now routine, structural and functional gene annotation is based solely on automated prediction. However, errors in gene structure are still frequently reported, especially in the determination of translation initiation codons. Here, we propose a strategy to enrich and detect protein N-termini by mass spectrometry in order to refine genome annotation. After selective derivatization of protein N-termini using (N-Succinimidyloxycarbonylmethyl)tris(2,4,6-trimethoxyphenyl)phosphonium bromide (TMPPAc-OSu) as the labeling reagent, proteins were digested with three proteases in parallel. TMPP-labeled N-terminal-most peptides were then separated from internal peptides by the COmbined FRActional DIagonal Chromatography (COFRADIC) sorting methodology before analysis by tandem mass spectrometry. Using this approach, we refined the genome annotation of a model marine bacterium, Roseobacter denitrificans.
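To illustrate why the N-terminal-most peptide is informative for start-codon assignment, the following is a minimal sketch, not the authors' pipeline. It assumes trypsin as the protease (the description does not name the three proteases used) and applies a simplified cleavage rule: cut after K or R unless the next residue is P. The function name `n_terminal_peptide` and the example sequence are hypothetical.

```python
def n_terminal_peptide(protein: str) -> str:
    """Return the N-terminal-most peptide under a simplified trypsin rule.

    Assumption: cleavage occurs after K or R, except when followed by P.
    """
    for i, residue in enumerate(protein):
        is_last = i == len(protein) - 1
        if residue in "KR" and (is_last or protein[i + 1] != "P"):
            return protein[: i + 1]
    return protein  # no cleavage site: the whole protein is a single peptide


# Hypothetical case: the predicted start codon lies 4 residues upstream
# of the start that is actually used.
annotated = "MSNQMTAPLKDERPGK"
true_start_offset = 4

print(n_terminal_peptide(annotated))                       # peptide expected from the annotation
print(n_terminal_peptide(annotated[true_start_offset:]))   # peptide expected from the real start
```

The two calls yield different peptides, so a labeled N-terminal peptide matching the second one rather than the first would point to a misannotated start codon.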
Project description: Identifying all essential genomic components is critical for the assembly of minimal artificial life. In the genome-reduced bacterium Mycoplasma pneumoniae, we found that small ORFs (smORFs; <100 residues), which account for 10% of all ORFs, are the most frequently essential genomic components (53%). The essentiality of smORFs may be explained by their function as members of protein and/or DNA/RNA complexes. In larger proteins, essentiality applied to individual domains rather than entire proteins, a notion we could confirm by expressing truncated domains. The fraction of essential non-coding RNAs that do not overlap essential genes is 5% higher than that of non-transcribed regions (0.9%), pointing to the important functions of the former. The data highlight the minimal genome (33%, 269,410 bp, of the M. pneumoniae genome), with an unexpected hidden layer of smORFs with essential functions.
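The smORF definition above (<100 residues) can be sketched as a simple length filter. This is an illustrative example only: the function names, ORF identifiers, and the way essential ORFs are supplied are assumptions, and the description does not specify how essentiality was scored.

```python
from typing import Dict, Iterable, List, Set

SMORF_MAX_RESIDUES = 100  # threshold from the description: smORFs have < 100 residues


def split_smorfs(orfs: Dict[str, str]) -> Dict[str, List[str]]:
    """Group ORF names into 'smORF' and 'larger' by protein length."""
    groups: Dict[str, List[str]] = {"smORF": [], "larger": []}
    for name, protein in orfs.items():
        length = len(protein.rstrip("*"))  # ignore a trailing stop symbol, if present
        key = "smORF" if length < SMORF_MAX_RESIDUES else "larger"
        groups[key].append(name)
    return groups


def essential_fraction(names: Iterable[str], essential: Set[str]) -> float:
    """Fraction of the given ORFs that appear in the essential set."""
    names = list(names)
    if not names:
        return 0.0
    return sum(name in essential for name in names) / len(names)


# Hypothetical input: placeholder sequences and an assumed essential set.
orfs = {"orfA": "M" * 50, "orfB": "M" * 100, "orfC": "M" * 99 + "*"}
groups = split_smorfs(orfs)
print(groups)
print(essential_fraction(groups["smORF"], {"orfA"}))
```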