Proteogenomic mapping of Mycoplasma hyopneumoniae virulent strain 232.
ABSTRACT: BACKGROUND: Mycoplasma hyopneumoniae causes respiratory disease in swine and contributes to the porcine respiratory disease complex, a major disease problem in the swine industry. The M. hyopneumoniae strain 232 genome is one of the smallest and best annotated microbial genomes, containing only 728 annotated genes and 691 known proteins. Standard protein databases for mass spectrometry only allow for the identification of known and predicted proteins, which if incorrect can limit our understanding of the biological processes at work. Proteogenomic mapping is a methodology which allows the entire 6-frame genome translation of an organism to be used as a mass spectrometry database to help identify unknown proteins as well as correct and confirm existing annotations. This methodology will be employed to perform an in-depth analysis of the M. hyopneumoniae proteome. RESULTS: Proteomic analysis indicates 483 of 691 (70%) known M. hyopneumoniae strain 232 proteins are expressed under the culture conditions given in this study. Furthermore, 171 of 328 (52%) hypothetical proteins have been confirmed. Proteogenomic mapping resulted in the identification of previously unannotated genes gatC and rpmF and 5-prime extensions to genes mhp063, mhp073, and mhp451, all conserved and annotated in other M. hyopneumoniae strains and Mycoplasma species. Gene prediction with Prodigal, a prokaryotic gene predicting program, completely supports the new genomic coordinates calculated using proteogenomic mapping. CONCLUSIONS: Proteogenomic mapping showed that the protein coding genes of the M. hyopneumoniae strain 232 identified in this study are well annotated. Only 1.8% of mapped peptides did not correspond to genes defined by the current genome annotation. This study also illustrates how proteogenomic mapping can be an important tool to help confirm, correct and append known gene models when using a genome sequence as search space for peptide mass spectra. 
Using a gene prediction program that scans for a wide variety of promoters can help ensure that genes are accurately predicted and not missed completely. Furthermore, protein extraction using differential detergent fractionation effectively increases the number of membrane and cytoplasmic proteins identifiable by mass spectrometry.
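The six-frame genome translation that proteogenomic mapping uses as its search database can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function names are hypothetical, and the codon table is NCBI translation table 4 (TGA read as tryptophan), the genetic code used for mycoplasmas, which is assumed here rather than stated in the abstract.

```python
from itertools import product

# NCBI translation table 4 (assumption: the mycoplasma code, TGA = Trp).
# Codons are enumerated in T, C, A, G order to match the amino acid string.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(genome):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    genome = genome.upper()
    frames = {}
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translation("ATGTGAAAATAA").items():
        print(label, protein)
```

In a real search database, each frame would typically be split at stop codons (`*`) and the resulting stretches written out as separate entries for the search engine; those steps are omitted here.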
Project description:Mycoplasma hyopneumoniae is associated with swine respiratory diseases. Although gene organization and regulation are well known in many prokaryotic organisms, knowledge of mycoplasmas is limited. This study performed a comparative analysis of three strains of M. hyopneumoniae (7448, J and 232), with a focus on genome organization and gene comparison for open reading frame (ORF) cluster (OC) identification. An in silico analysis of gene organization demonstrated 117 OCs and 34 single ORFs in M. hyopneumoniae 7448 and J, while 116 OCs and 36 single ORFs were identified in M. hyopneumoniae 232. Genomic comparison revealed high synteny and conservation of gene order between the OCs defined for 7448 and J strains as well as for 7448 and 232 strains. Twenty-one OCs were chosen and experimentally confirmed by reverse transcription-PCR from the M. hyopneumoniae 7448 genome, validating our prediction. A subset of the ORFs within an OC could be independently transcribed due to the presence of internal promoters. Our results suggest that transcription occurs in 'run-on' from an upstream promoter in M. hyopneumoniae, thus forming large ORF clusters (from 2 to 29 ORFs in the same orientation) and indicating a complex transcriptional organization.
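The idea of an ORF cluster as a run of ORFs in the same orientation can be illustrated with a short Python sketch. It is a deliberate simplification: it groups on strand alone, whereas the study's actual in silico criteria are not given in the abstract. The function name and the `(name, strand)` input format are hypothetical.

```python
from itertools import groupby


def orf_clusters(orfs):
    """Split ORFs, listed in genome order as (name, strand), into runs.

    Runs of two or more consecutive ORFs on the same strand are returned
    as clusters; runs of length one are returned as single ORFs.
    Assumption: orientation is the only grouping criterion.
    """
    clusters, singles = [], []
    for _, run in groupby(orfs, key=lambda orf: orf[1]):
        names = [name for name, _ in run]
        if len(names) >= 2:
            clusters.append(names)
        else:
            singles.append(names[0])
    return clusters, singles


if __name__ == "__main__":
    example = [("a", "+"), ("b", "+"), ("c", "-"), ("d", "+"), ("e", "+"), ("f", "+")]
    print(orf_clusters(example))
```

Candidate clusters found this way would still need experimental support, such as the reverse transcription-PCR confirmation described above.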
Project description:Mycoplasma hyopneumoniae is the cause of enzootic pneumonia in pigs, a chronic respiratory disease associated with significant economic losses to swine producers worldwide. The molecular pathogenesis of infection is poorly understood due to the lack of genetic tools to allow manipulation of the organism and more generally for the Mycoplasma genus. The objective of this study was to develop a system for generating random transposon insertion mutants in M. hyopneumoniae that could prove a powerful tool in enabling the pathogenesis of infection to be unraveled. A novel delivery vector was constructed containing a hyperactive C9 mutant of the Himar1 transposase along with a mini transposon containing the tetracycline resistance cassette, tetM. M. hyopneumoniae strain 232 was electroporated with the construct and tetM-expressing transformants selected on agar containing tetracycline. Individual transformants contained single transposon insertions that were stable upon serial passages in broth medium. The insertion sites of 44 individual transformants were determined and confirmed disruption of several M. hyopneumoniae genes. A large pool of over 10 000 mutants was generated that should allow saturation of the M. hyopneumoniae strain 232 genome. This is the first time that transposon mutagenesis has been demonstrated in this important pathogen and could be generally applied for other Mycoplasma species that are intractable to genetic manipulation. The ability to generate random mutant libraries is a powerful tool in the further study of the pathogenesis of this important swine pathogen.
Project description:BACKGROUND:Proteogenomic mapping is an approach that uses mass spectrometry data from proteins to directly map protein-coding genes and could aid in locating translational regions in the human genome. In concert with the ENcyclopedia of DNA Elements (ENCODE) project, we applied proteogenomic mapping to produce proteogenomic tracks for the UCSC Genome Browser, to explore which putative translational regions may be missing from the human genome. RESULTS:We generated ~1 million high-resolution tandem mass (MS/MS) spectra for Tier 1 ENCODE cell lines K562 and GM12878 and mapped them against the UCSC hg19 human genome, and the GENCODE V7 annotated protein and transcript sets. We then compared the results from the three searches to identify the best-matching peptide for each MS/MS spectrum, thereby increasing the confidence of the putative new protein-coding regions found via the whole genome search. At a 1% false discovery rate, we identified 26,472, 24,406, and 13,128 peptides from the protein, transcript, and whole genome searches, respectively; of these, 481 were found solely via the whole genome search. The proteogenomic mapping data are available on the UCSC Genome Browser at http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeUncBsuProt. CONCLUSIONS:The whole genome search revealed that ~4% of the uniquely mapping identified peptides were located outside GENCODE V7 annotated exons. The comparison of the results from the disparate searches also identified 15% more spectra than would have been found solely from a protein database search. Therefore, whole genome proteogenomic mapping is a complementary method for genome annotation when performed in conjunction with other searches.
Project description:Mycoplasma hyopneumoniae and Mycoplasma flocculare are genetically similar bacteria that coinhabit the porcine respiratory tract. These mycoplasmas share most of the known virulence factors, but while M. hyopneumoniae causes porcine enzootic pneumonia (PEP), M. flocculare is a commensal species. To identify potential PEP determinants and provide novel insights into mycoplasma-host interactions, the whole cell proteomes of two M. hyopneumoniae strains, one pathogenic (7448) and the other non-pathogenic (J), and of M. flocculare were compared. A cell fractionation approach combined with mass spectrometry (LC-MS/MS) proteomics was used to analyze cytoplasmic and surface-enriched protein fractions. On average, ~50% of the predicted proteomes of M. hyopneumoniae 7448 and J and of M. flocculare were detected. Many of the identified proteins were differentially represented in M. hyopneumoniae 7448 in comparison to M. hyopneumoniae J and M. flocculare, including potential PEP determinants such as adhesins, proteases, and redox-balancing proteins, among others. The LC-MS/MS data also provided experimental validation for several genes previously regarded as hypothetical for all analyzed mycoplasmas, including some coding for proteins bearing virulence-related functional domains. The comprehensive proteome profiling of two M. hyopneumoniae strains and M. flocculare provided tens of novel candidate PEP determinants or virulence factors, beyond those classically described.
Project description:Transcriptional regulation, a multiple-step process, is still poorly understood in the important pig pathogen Mycoplasma hyopneumoniae. Basic motifs like promoters and terminators have already been described, but no other cis-regulatory elements have been found. DNA repeat sequences have been shown to be an interesting potential source of cis-regulatory elements. In this work, a genome-wide search for tandem and palindromic repetitive elements was performed in the intergenic regions of all coding sequences from M. hyopneumoniae strain 7448. Computational analysis demonstrated the presence of 144 tandem repeats and 1,171 palindromic elements. The DNA repeat sequences were distributed within the 5' upstream regions of 86% of transcriptional units of M. hyopneumoniae strain 7448. Comparative analysis between distinct repetitive sequences found in related mycoplasma genomes demonstrated different percentages of conservation among pathogenic and nonpathogenic strains. qPCR assays revealed differential expression among genes showing variable numbers of repetitive elements. In addition, repeats found in 206 genes already described to be differentially regulated under different culture conditions of M. hyopneumoniae strain 232 showed almost 80% conservation in relation to M. hyopneumoniae strain 7448 repeats. Altogether, these findings suggest a potential regulatory role of tandem and palindromic DNA repeats in the M. hyopneumoniae transcriptional profile.
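A search for palindromic elements of the kind described above can be sketched in Python. The sketch treats a DNA palindrome as a stretch that equals its own reverse complement and scans a sequence for such windows. The function names and the length limits are illustrative assumptions; the abstract does not state which tools or parameters the study used, and this sketch does not handle imperfect palindromes or spacers.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_palindromes(seq, min_len=6, max_len=12):
    """Return (start, window) for every perfect reverse-complement palindrome.

    Only even lengths are scanned, because the middle base of an odd-length
    window can never equal its own complement.
    Assumption: min_len and max_len are illustrative defaults.
    """
    seq = seq.upper()
    first_len = min_len + (min_len % 2)  # round up to the next even length
    hits = []
    for length in range(first_len, max_len + 1, 2):
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            if window == reverse_complement(window):
                hits.append((start, window))
    return hits


if __name__ == "__main__":
    print(find_palindromes("ACGAATTCCA"))
```

Restricting the input to intergenic regions, as the study did, would be a matter of calling such a function on each intergenic sequence separately.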
Project description:The swine respiratory ciliary epithelium is mainly colonized by Mycoplasma hyopneumoniae, Mycoplasma flocculare and Mycoplasma hyorhinis. While colonization by M. flocculare is virtually asymptomatic, M. hyopneumoniae and M. hyorhinis infections may cause respiratory disease. Information regarding transcript structure and gene abundance provides valuable insight into gene function and regulation, which has not yet been analyzed on a genome-wide scale in these Mycoplasma species. In this study, we report the construction of transcriptome maps for M. hyopneumoniae, M. flocculare and M. hyorhinis, which provide data for comparative studies of the transcriptional repertoire. For each species, three cDNA libraries were generated, yielding averages of 415,265, 695,313 and 93,578 reads for M. hyopneumoniae, M. flocculare and M. hyorhinis, respectively, with an average read length of 274 bp. Read mapping showed that 92%, 98% and 96% of the predicted genes were transcribed in the M. hyopneumoniae, M. flocculare and M. hyorhinis genomes, respectively. Moreover, we showed that the majority of the genes are co-expressed, confirming the previously predicted transcription units. Finally, our data defined the RNA populations in detail, mapping transcript boundaries and transcription unit structures on a genome-wide scale.
Project description:The characterization of the repertoire of proteins exposed on the cell surface by Mycoplasma hyopneumoniae (M. hyopneumoniae), the etiological agent of enzootic pneumonia in pigs, is critical to understanding physiological processes associated with bacterial infection capacity, survival and pathogenesis. Previous in silico studies predicted that about a third of the genes in the M. hyopneumoniae genome code for surface proteins, but so far only a few of them have experimental confirmation of their expression and surface localization. In this work, M. hyopneumoniae surface proteins were labeled in intact cells with biotin, and affinity-captured biotin-labeled proteins were identified by a gel-based liquid chromatography-tandem mass spectrometry approach. A total of 20 gel slices were separately analyzed by mass spectrometry, resulting in 165 protein identifications corresponding to 59 different protein species. The identified surface-exposed proteins better defined the set of M. hyopneumoniae proteins exposed to the host and added confidence to in silico predictions. Several proteins potentially related to pathogenesis were identified, including known adhesins and also hypothetical proteins with adhesin-like topologies, consisting of a transmembrane helix and a large tail exposed at the cell surface. The results provided a better picture of the M. hyopneumoniae cell surface that will help in the understanding of processes important for bacterial pathogenesis. Considering the experimental demonstration of surface exposure, adhesin-like topology predictions and absence of orthologs in the closely related, non-pathogenic species Mycoplasma flocculare, several proteins could be proposed as potential targets for the development of drugs, vaccines and/or immunodiagnostic tests for enzootic pneumonia.
Project description:Enzootic pneumonia caused by Mycoplasma hyopneumoniae is a major constraint to efficient pork production throughout the world. This pathogen has a small genome with 716 coding sequences, of which 418 are homologous to proteins with known functions. However, almost 42% of the 716 coding sequences are annotated as hypothetical proteins. Alternative methodologies such as threading and comparative modeling can be used to predict structures and functions of such hypothetical proteins. Often, these alternative methods can answer questions about the properties of a model system faster than experiments. In this study, we predicted the structures of seven proteins annotated as hypothetical in M. hyopneumoniae, using the structure-based approaches mentioned above. Three proteins were predicted to be involved in metabolic processes and two in transcription, while no function could be assigned to the remaining two. However, the modeled structures of the last two proteins suggested experimental designs to identify their functions. Our findings are important in diminishing the gap between the lack of annotation of important metabolic pathways and the great number of hypothetical proteins in the M. hyopneumoniae genome.
Project description:Mycoplasma hyopneumoniae is the etiological agent of swine enzootic pneumonia, resulting in considerable economic losses in the swine industry. A few genome sequences of M. hyopneumoniae have been reported to date, implying that additional genome data are needed for further genetic studies. Here, we present the annotated genome sequence of M. hyopneumoniae strain KM014.
Project description:Mycoplasma hyopneumoniae causes enormous economic losses to swine production worldwide by colonizing the ciliated epithelium in the porcine respiratory tract, resulting in widespread damage to the mucociliary escalator, prolonged inflammation, reduced weight gain, and secondary infections. Protein Mhp684 (P146) comprises 1,317 amino acids, and while the N-terminal 400 residues display significant sequence identity to the archetype cilium adhesin P97, the remainder of the molecule is novel and displays unusual motifs. Proteome analysis shows that P146 preprotein is endogenously cleaved into three major fragments identified here as P50(P146), P40(P146), and P85(P146) that reside on the cell surface. Liquid chromatography with tandem mass spectrometry (LC-MS/MS) identified a semitryptic peptide that delineated a major cleavage site in Mhp684. Cleavage occurred at the phenylalanine residue within sequence (672)ATEF↓QQ(677), consistent with a cleavage motif resembling S/T-X-F↓X-D/E recently identified in Mhp683 and other P97/P102 family members. Biotinylated surface proteins recovered by avidin chromatography and separated by two-dimensional gel electrophoresis (2-D GE) showed that more-extensive endoproteolytic cleavage of P146 occurs. Recombinant fragments F1(P146)-F3(P146) that mimic P50(P146), P40(P146), and P85(P146) were constructed and shown to bind porcine epithelial cilia and biotinylated heparin with physiologically relevant affinity. Recombinant versions of F3(P146) generated from M. hyopneumoniae strain J and 232 sequences strongly bind porcine plasminogen, and the removal of their respective C-terminal lysine and arginine residues significantly reduces this interaction. These data reveal that P146 is an extensively processed, multifunctional adhesin of M. hyopneumoniae. Extensive cleavage coupled with variable cleavage efficiency provides a mechanism by which M. hyopneumoniae regulates protein topography.
IMPORTANCE: Vaccines used to control Mycoplasma hyopneumoniae infection provide only partial protection. Proteins of the P97/P102 families are highly expressed, functionally redundant molecules that are substrates of endoproteases that generate multifunctional adhesin fragments on the cell surface. We show that P146 displays a chimeric structure consisting of an N terminus, which shares sequence identity with P97, and novel central and C-terminal regions. P146 is endoproteolytically processed at multiple sites, generating at least nine fragments on the surface of M. hyopneumoniae. Dominant cleavage events occurred at S/T-X-F↓X-D/E-like sites generating P50(P146), P40(P146), and P85(P146). Recombinant proteins designed to mimic the major cleavage fragments bind porcine cilia, heparin, and plasminogen. P146 undergoes endoproteolytic processing events at multiple sites and with differential processing efficiency, generating combinatorial diversity on the surface of M. hyopneumoniae.