A novel computational method identifies intra- and inter-species recombination events in Staphylococcus aureus and Streptococcus pneumoniae.
ABSTRACT: Advances in high-throughput DNA sequencing technologies have led to an explosion in the number of sequenced bacterial genomes. Comparative sequence analysis frequently reveals evidence of homologous recombination occurring with different mechanisms and rates in different species, but the large-scale use of computational methods to identify recombination events is hampered by their high computational costs. Here, we propose a new method to identify recombination events in large datasets of whole genome sequences. By filtering the gene conservation profiles of a test genome against a panel of strains, the algorithm identifies sets of contiguous genes acquired by homologous recombination. The locations of the recombination breakpoints are determined using a statistical test that accounts for the differences in the natural rate of evolution between different genes. The algorithm was tested on a dataset of 75 genomes of Staphylococcus aureus and 50 genomes comprising different streptococcal species, and was able to detect intra-species recombination events in S. aureus and in Streptococcus pneumoniae. Furthermore, we found evidence of an inter-species exchange of genetic material between S. pneumoniae and Streptococcus mitis, a closely related commensal species that colonizes the same ecological niche. The method has been implemented in an R package, Reco, which is freely available in the supplementary material, and provides a rapid screening tool to investigate recombination on a genome-wide scale from sequence data.
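The idea of scanning a gene conservation profile for contiguous, unusually divergent genes can be illustrated with a minimal sketch. This is not the Reco implementation, and the abstract does not specify its statistical test. The function name `find_divergent_runs`, the per-gene z-score, and the `z_cutoff` and `min_genes` parameters are all assumptions made for illustration. Each gene's observed divergence is standardized against a gene-specific expectation, which is one simple way to account for genes that naturally evolve at different rates.

```python
from typing import List, Sequence, Tuple


def find_divergent_runs(
    observed: Sequence[float],
    expected: Sequence[float],
    spread: Sequence[float],
    z_cutoff: float = 3.0,   # hypothetical threshold, not taken from the paper
    min_genes: int = 2,      # hypothetical minimum run length
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of contiguous outlier genes.

    observed[i]: divergence of gene i in the test genome against the panel
    expected[i]: typical divergence of gene i (gene-specific baseline)
    spread[i]:   standard deviation of that baseline
    Genes are assumed to be listed in chromosomal order.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    for i, (obs, exp, sd) in enumerate(zip(observed, expected, spread)):
        # Standardize per gene so fast-evolving genes are not flagged by default.
        z = (obs - exp) / sd if sd > 0 else 0.0
        if z > z_cutoff:
            if start is None:
                start = i          # open a new candidate segment
        else:
            if start is not None and i - start >= min_genes:
                runs.append((start, i - 1))
            start = None           # close (or discard) the segment
    # Handle a segment that extends to the last gene.
    if start is not None and len(observed) - start >= min_genes:
        runs.append((start, len(observed) - 1))
    return runs
```

With a toy profile of eight genes in which genes 2 to 4 are far more divergent than their baseline, the function reports a single segment covering those three genes and ignores an isolated outlier.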
Project description: Streptococcus pneumoniae is one of the most important causes of microbial diseases in humans. The genomes of 44 diverse strains of S. pneumoniae were analyzed and compared with strains of non-pathogenic streptococci of the Mitis group. Despite evidence of extensive recombination, the S. pneumoniae phylogenetic tree revealed six major lineages. With the exception of serotype 1, the tree correlated poorly with capsular serotype, geographical site of isolation and disease outcome. The distribution of dispensable genes (genes present in more than one strain but not in all strains) was consistent with phylogeny, although horizontal gene transfer events attenuated this correlation in the case of ancient lineages. Homologous recombination, involving short stretches of DNA, was the dominant evolutionary process of the core genome of S. pneumoniae. Genetic exchange occurred both within and across the borders of the species, and S. mitis was the main reservoir of genetic diversity of S. pneumoniae. The pan-genome size of S. pneumoniae increased logarithmically with the number of strains and linearly with the number of polymorphic sites of the sampled genomes, suggesting that acquired genes accumulate proportionately to the age of clones. Most genes associated with pathogenicity were shared by all S. pneumoniae strains, but were also present in S. mitis, S. oralis and S. infantis, indicating that these genes are not sufficient to determine virulence. Genetic exchange with related species sharing the same ecological niche is the main mechanism of evolution of S. pneumoniae. The open pan-genome guarantees the species a quick and economical response to diverse environments.
Project description: Antibiotic resistance in Streptococcus pneumoniae is often the result of horizontal gene transfer events involving closely related streptococcal species. Laboratory experiments confirmed that S. mitis DNA functions as a donor in transformation experiments, using the laboratory strain S. pneumoniae R6 as recipient and chromosomal DNA of the high-level penicillin-resistant strain S. mitis B6. After four transformation steps, alterations in five penicillin-binding proteins (PBP) were observed, and sequence analysis confirmed recombination events in the corresponding PBP genes. To detect regions where recombination with S. mitis DNA had occurred, we analyzed the S. pneumoniae transformants by microarray analyses, using oligonucleotide microarrays designed for both the S. pneumoniae genome and the S. mitis B6 genome.
Project description: We present a comparative analysis of predicted highly expressed (PHX) genes in the low G+C Gram-positive genomes of Bacillus subtilis, Bacillus halodurans, Listeria monocytogenes, Listeria innocua, Lactococcus lactis, Streptococcus pyogenes, Streptococcus pneumoniae, Staphylococcus aureus, Clostridium acetobutylicum, and Clostridium perfringens. Most enzymes acting in glycolysis and fermentation pathways are PHX in these genomes, but not those involved in the TCA cycle and respiration, suggesting that these organisms have predominantly adapted to grow rapidly in an anaerobic environment. Only B. subtilis and B. halodurans have several TCA cycle PHX genes, whereas the TCA pathway is entirely missing from the metabolic repertoire of the two Streptococcus species and is incomplete in Listeria, Lactococcus, and Clostridium. Pyruvate-formate lyase, an enzyme critical in mixed acid fermentation, is among the highest PHX genes in all these genomes except for C. acetobutylicum (not PHX) and B. subtilis and B. halodurans (missing). Pyruvate-formate lyase is also prominently PHX in enteric gamma-proteobacteria, but not in other prokaryotes. Phosphotransferase system genes are generally PHX with selection of different substrates in different genomes. The various substrate specificities among phosphotransferase systems in different genomes apparently reflect differences in habitat, lifestyle, and nutrient sources.
Project description: The identification of clones within bacterial populations is often taken as evidence for a low rate of recombination, but the validity of this inference is rarely examined. We have used statistical tests of congruence between gene trees to examine the extent and significance of recombination in six bacterial pathogens. For Neisseria meningitidis, Streptococcus pneumoniae, Streptococcus pyogenes, and Staphylococcus aureus, the congruence between the maximum likelihood trees reconstructed using seven house-keeping genes was in most cases no better than that between each tree and trees of random topology. The lack of congruence between gene trees in these four species, which include both naturally transformable and nontransformable species, is in three cases supported by high ratios of recombination to point mutation during clonal diversification (estimates of this parameter were not possible for S. pyogenes). In contrast, gene trees constructed for Haemophilus influenzae and pathogenic isolates of Escherichia coli showed a higher degree of congruence, suggesting lower rates of recombination. The impact of recombination therefore varies between bacterial species but in many species is sufficient to obliterate the phylogenetic signal in gene trees.
Project description: For more than a decade, pan-genome analysis has been applied as an effective method for explaining variation in the genetic content of prokaryotic species. However, genomic characteristics and detailed structures of gene pools have not been fully clarified, because most studies have used a small number of genomes. Here, we constructed pan-genomes of seven species in order to elucidate variations in the genetic contents of >27,000 genomes belonging to Streptococcus pneumoniae, Staphylococcus aureus subsp. aureus, Salmonella enterica subsp. enterica, Escherichia coli and Shigella spp., Mycobacterium tuberculosis complex, Pseudomonas aeruginosa, and Acinetobacter baumannii. This work showed that the pan-genomes of all seven species are open. Additionally, systematic evaluation of the characteristics of their pan-genomes revealed that phylogenetic distance provided valuable information for estimating the parameters for pan-genome size among several models including Heaps' law. Our results provide a better understanding of the species and a solution to minimize sampling biases associated with genome-sequencing preferences for pathogenic strains.
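As an illustration of what fitting Heaps' law to pan-genome size involves, the sketch below assumes the common power-law form P(N) = kappa * N^gamma, where P is the number of distinct genes seen after sampling N genomes, and estimates the two parameters by least squares on the log-log scale. The function name `fit_heaps_law` and the synthetic data are illustrative assumptions. The study's own fitting procedure, and how it incorporates phylogenetic distance, are not described in the abstract. In this parameterisation, a positive exponent (the pan-genome keeps growing as genomes are added) is commonly read as an open pan-genome.

```python
import math
from typing import Sequence, Tuple


def fit_heaps_law(
    n_genomes: Sequence[int], pan_sizes: Sequence[float]
) -> Tuple[float, float]:
    """Fit P(N) = kappa * N**gamma by linear regression of log P on log N.

    Returns (kappa, gamma). Inputs must be positive and contain at least
    two distinct genome counts.
    """
    xs = [math.log(n) for n in n_genomes]
    ys = [math.log(p) for p in pan_sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct genome counts")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx                          # slope on the log-log scale
    kappa = math.exp(mean_y - gamma * mean_x)  # intercept, back-transformed
    return kappa, gamma


if __name__ == "__main__":
    # Synthetic accumulation curve (not data from the study).
    counts = list(range(1, 51))
    sizes = [2000.0 * n ** 0.3 for n in counts]
    kappa, gamma = fit_heaps_law(counts, sizes)
    print(f"kappa={kappa:.1f}, gamma={gamma:.3f}")
```

A log-log regression is the simplest choice here. Real accumulation curves are usually averaged over many random genome orderings before fitting, which this sketch leaves out.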
Project description: Distinguishing the species of mitis group streptococci is challenging due to ambiguous phenotypic characteristics and a high degree of genetic similarity. This has been particularly true for resolving atypical Streptococcus pneumoniae and Streptococcus pseudopneumoniae. We used phylogenetic clustering to demonstrate specific and separate clades for both S. pneumoniae and S. pseudopneumoniae genomes. The genomes that clustered within these defined clades were used to extract species-specific genes from the pan-genome. The S. pneumoniae marker was detected in 8027 out of 8051 (>99.7%) S. pneumoniae genomes. The S. pseudopneumoniae marker was specific for all genomes that clustered in the S. pseudopneumoniae clade, including unresolved species of the genus Streptococcus sequenced by the BC Centre for Disease Control Public Health Laboratory that previously could not be distinguished by other methods. Other than the presence of the S. pseudopneumoniae marker in six of 8051 (<0.08%) S. pneumoniae genomes, both the S. pneumoniae and S. pseudopneumoniae markers showed little to no detectable cross-reactivity to the genomes of any other species of the genus Streptococcus or to a panel of over 46,000 genomes from viral, fungal, bacterial pathogens and microbiota commonly found in the respiratory tract. A real-time PCR assay was designed targeting these two markers. Genomics provides a useful technique for PCR assay design and development.
Project description: Transformation is an important mechanism of microbial evolution through which bacteria have been observed to rapidly adapt in response to clinical interventions; examples include facilitating vaccine evasion and the development of penicillin resistance in the major respiratory pathogen Streptococcus pneumoniae. To characterise the process in detail, the genomes of 124 S. pneumoniae isolates produced through in vitro transformation were sequenced and recombination events detected. Those recombinations importing the selected marker were independent of unselected events elsewhere in the genome, the positions of which were not significantly affected by local sequence similarity between donor and recipient or mismatch repair processes. However, both types of recombinations were sometimes mosaic, with multiple non-contiguous segments originating from the same molecule of donor DNA. The lengths of the unselected events were exponentially distributed with a mean of 2.3 kb, implying that recombinations are stochastically resolved with a fixed per base probability of 4.4×10^-4 bp^-1. This distribution of recombination sizes, coupled with an observed under-representation of large insertions within transferred sequence, suggests transformation has the potential to reduce the size of bacterial genomes, and is unlikely to act as an efficient mechanism for the uptake of accessory genomic loci.
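The link between the mean event length and the per-base resolution probability can be checked with a short sketch. If a recombination ends at each base with a fixed probability p, its length follows a geometric distribution with mean 1/p, which is close to an exponential distribution when p is small. The reported mean of 2.3 kb then gives p of about 1/2300, or roughly 4.3×10^-4 per bp. The gap to the reported 4.4×10^-4 is presumably due to rounding of the mean, which is an assumption here rather than something the abstract states. The function names and the simulation are illustrative and are not the authors' analysis.

```python
import math
import random
from typing import List

MEAN_LENGTH_BP = 2300.0  # mean unselected event length reported in the abstract


def per_base_stop_probability(mean_length_bp: float) -> float:
    """Per-base probability of resolution implied by a geometric length model."""
    return 1.0 / mean_length_bp


def sample_lengths(p: float, n: int, seed: int = 0) -> List[int]:
    """Draw n geometric lengths (support 1, 2, ...) by inverse-transform sampling."""
    rng = random.Random(seed)
    log_continue = math.log(1.0 - p)  # log-probability of extending one more base
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return [int(math.log(1.0 - rng.random()) / log_continue) + 1 for _ in range(n)]


if __name__ == "__main__":
    p = per_base_stop_probability(MEAN_LENGTH_BP)
    lengths = sample_lengths(p, 20000)
    print(f"implied p = {p:.2e} per bp")
    print(f"simulated mean length = {sum(lengths) / len(lengths):.0f} bp")
```

Under this model most events are shorter than the mean, which fits the abstract's point that long insertions are unlikely to be imported intact.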
Project description: Prokaryotic evolution is affected by horizontal transfer of genetic material through recombination. Inference of an evolutionary tree of bacteria thus relies on accurate identification of the population genetic structure and recombination-derived mosaicism. Rapidly growing databases represent a challenge for computational methods to detect recombinations in bacterial genomes. We introduce a novel algorithm called fastGEAR which identifies lineages in diverse microbial alignments, and recombinations between them and from external origins. The algorithm detects both recent recombinations (affecting a few isolates) and ancestral recombinations between detected lineages (affecting entire lineages), thus providing insight into recombinations affecting deep branches of the phylogenetic tree. In simulations, fastGEAR had comparable power to detect recent recombinations and outstanding power to detect the ancestral ones, compared with state-of-the-art methods, often with a fraction of computational cost. We demonstrate the utility of the method by analyzing a collection of 616 whole genomes of the recombinogenic pathogen Streptococcus pneumoniae, for which the method provided a high-resolution view of recombination across the genome. We examined in detail the penicillin-binding genes across the Streptococcus genus, demonstrating previously undetected genetic exchanges between different species at these three loci. Hence, fastGEAR can be readily applied to investigate mosaicism in bacterial genes across multiple species. Finally, fastGEAR correctly identified many known recombination hotspots and pointed to potential new ones. Matlab code and Linux/Windows executables are available at https://users.ics.aalto.fi/~pemartti/fastGEAR/ (last accessed February 6, 2017).
Project description: According to a highly polymorphic region in the lytA gene, encoding the major autolysin of Streptococcus pneumoniae, two different families of alleles can be differentiated by PCR and restriction digestion. Here, we provide evidence that this polymorphic region arose from recombination events with homologous genes of pneumococcal temperate phages.
Project description: Polypeptide deformylase (PDF) catalyzes the deformylation of polypeptide chains in bacteria. It is essential for bacterial cell viability and is a potential antibacterial drug target. Here, we report the crystal structures of polypeptide deformylase from four different species of bacteria: Streptococcus pneumoniae, Staphylococcus aureus, Haemophilus influenzae, and Escherichia coli. Comparison of these four structures reveals significant overall differences between the two Gram-negative species (E. coli and H. influenzae) and the two Gram-positive species (S. pneumoniae and S. aureus). Despite these differences and low overall sequence identity, the S1' pocket of PDF is well conserved among the four enzymes studied. We also describe the binding of nonpeptidic inhibitor molecules SB-485345, SB-543668, and SB-505684 to both S. pneumoniae and E. coli PDF. Comparison of these structures shows similar binding interactions with both Gram-negative and Gram-positive species. Understanding the similarities and subtle differences in active site structure between species will help to design broad-spectrum polypeptide deformylase inhibitor molecules.