Genetic variation in the Staphylococcus aureus 8325 strain lineage revealed by whole-genome sequencing.
ABSTRACT: Staphylococcus aureus strains of the 8325 lineage, especially 8325-4 and derivatives lacking prophage, have been used extensively for decades of research. We report herein the results of our deep sequence analysis of strain 8325-4. Assignment of sequence variants compared with the reference strain 8325 (NRS77/PS47) required correction of errors in the 8325 reference genome, and reassessment of variation previously attributed to chemical mutagenesis of the restriction-defective RN4220. Using an extensive strain pedigree analysis, we discovered that 8325-4 contains 16 single nucleotide polymorphisms (SNP) arising prior to the construction of RN4220. We identified 5 indels in 8325-4 compared with 8325. Three indels correspond to expected Φ11, 12, 13 excisions, one indel is explained by a sequence assembly artifact, and the final indel (Δ63bp) in the spa-sarS intergenic region is common to only a sub-lineage of 8325-4 strains including SH1000. This deletion was found to significantly decrease (75%) steady state sarS but not spa transcript levels in post-exponential phase. The sub-lineage 8325-4 was also found to harbor 4 additional SNPs. We also found large sequence variation between 8325, 8325-4 and RN4220 in a cluster of repetitive hypothetical proteins (SA0282 homologs) near the Ess secretion cluster. The overall 8325-4 SNP set results in 17 alterations within coding sequences. Remarkably, we discovered that all tested strains of the 8325-4 lineage lack phenol soluble modulin α3 (PSMα3), a virulence determinant implicated in neutrophil chemotaxis, biofilm architecture and surface spreading. Collectively, our results clarify and define the 8325-4 pedigree and reveal clear evidence that mutations existing throughout all branches of this lineage, including the widely used RN6390 and SH1000 strains, could conceivably impact virulence regulation.
Project description:Here, the genes encoding three different fluorescent proteins were cloned into the stably maintained Staphylococcus aureus shuttle vector pKK30. The resulting plasmids were transformed into two S. aureus strains; SH1000 and RN4220. Stability assays illustrated that the three recombinant plasmids retained near 100% maintenance in vitro for 160 generations. S. aureus strain SH1000 expressing green fluorescent protein was then inoculated in an ovine model and in vivo stability for 6 days was demonstrated. In essence, these reporter plasmids represent a useful set of tools for dynamic imaging studies in S. aureus. These three reporter plasmids are available through BEI Resources.
Project description:In most Staphylococcus aureus strains, inactivation of sarA increases hla transcription, indicating that sarA is a repressor. However, in S. aureus NCTC 8325 and its derivatives, used for most studies of hla regulation, inactivation of sarA resulted in decreased hla transcription. The disparate phenotype of strain NCTC 8325 seems to be associated with its rsbU mutation, which leads to sigma(B) deficiency. This has now been verified by the demonstration that sarA repressed hla transcription in an rsbU+ derivative of strain 8325-4 (SH1000). That sarA could act as a repressor of hla in an 8325-4 background was confirmed by the observation that inactivation of sarA in an agr sarS rot triple mutant dramatically increased hla transcription to wild-type levels. However, the apparent role of sarA as an activator of hla in 8325-4 was not a result of the rsbU mutation alone, as inactivation of sarA in another rsbU mutant, strain V8, led to increased hla transcription. Northern blot analysis revealed much higher levels of sarS mRNA in strain V8 than in 8325-4, which was likely due to the mutation in the sarS activator, tcaR, in 8325-4, which was not found in strain V8. On the other hand, the relative increase in sarS transcription upon the inactivation of sarA was 15-fold higher in 8325-4 than in strain V8. Because of this, inactivation of sarA in 8325-4 means a net increase in repressor activity, whereas in strain V8, inactivation of sarA means a net decrease in repressor activity and, therefore, enhanced hla transcription.
Project description:Sequence simulation is an important tool in validating biological hypotheses as well as testing various bioinformatics and molecular evolutionary methods. Hypothesis testing relies on the representational ability of the sequence simulation method. Simple hypotheses are testable through simulation of random, homogeneously evolving sequence sets. However, testing complex hypotheses, for example, local similarities, requires simulation of sequence evolution under heterogeneous models. To this end, we previously introduced indel-Seq-Gen version 1.0 (iSGv1.0; indel, insertion/deletion). iSGv1.0 allowed heterogeneous protein evolution and motif conservation as well as insertion and deletion constraints in subsequences. Despite these advances, for complex hypothesis testing, neither iSGv1.0 nor other currently available sequence simulation methods is sufficient. indel-Seq-Gen version 2.0 (iSGv2.0) aims at simulating evolution of highly divergent DNA sequences and protein superfamilies. iSGv2.0 improves upon iSGv1.0 through the addition of lineage-specific evolution, motif conservation using PROSITE-like regular expressions, indel tracking, subsequence-length constraints, as well as coding and noncoding DNA evolution. Furthermore, we formalize the sequence representation used for iSGv2.0 and uncover a flaw in the modeling of indels used in current state of the art methods, which biases simulation results for hypotheses involving indels. We fix this flaw in iSGv2.0 by using a novel discrete stepping procedure. Finally, we present an example simulation of the calycin-superfamily sequences and compare the performance of iSGv2.0 with iSGv1.0 and random model of sequence evolution.
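The abstract above mentions motif conservation using PROSITE-like regular expressions. As a rough illustration of what such a pattern looks like in practice, the sketch below converts a subset of standard PROSITE syntax into a Python regular expression. The function name `prosite_to_regex` is hypothetical, and this is not iSGv2.0's own implementation, which the abstract does not describe; it only assumes the publicly documented PROSITE conventions (`-` separators, `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repeats, `<` and `>` for sequence ends).

```python
import re

# One PROSITE element: optional '<', a core (x, a residue, [set], {excluded set}),
# an optional repeat such as (3) or (2,4), and an optional '>'.
_ELEMENT = re.compile(
    r"(<?)(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)"
)

def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string.

    Hypothetical helper for illustration; handles only a subset of the syntax.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        start, core, low, high, end = m.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # excluded residues
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if start else "") + core + repeat + ("$" if end else ""))
    return "".join(parts)

regex = prosite_to_regex("C-x(2,4)-C-{P}-[ST]")
hit = re.search(regex, "AACGGCAS")
```

A simulator could use a compiled pattern like this to reject substitutions or indels that would break a conserved motif, though how iSGv2.0 enforces the constraint is not stated here.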
Project description:BACKGROUND: Insertions and deletions (indels) represent a common type of sequence variations, which are less studied and pose many important biological questions. Recent research has shown that the presence of sizable indels in protein sequences may be indicative of protein essentiality and their role in protein interaction networks. Examples of utilization of indels for structure-based drug design have also been recently demonstrated. Nonetheless many structural and functional characteristics of indels remain less researched or unknown. DESCRIPTION: We have created a web-based resource, Indel PDB, representing a structural database of insertions/deletions identified from the sequence alignments of highly similar proteins found in the Protein Data Bank (PDB). Indel PDB utilized large amounts of available structural information to characterize 1-, 2- and 3-dimensional features of indel sites. Indel PDB contains 117,266 non-redundant indel sites extracted from 11,294 indel-containing proteins. Unlike loop databases, Indel PDB features more indel sequences with secondary structures including alpha-helices and beta-sheets in addition to loops. The insertion fragments have been characterized by their sequences, lengths, locations, secondary structure composition, solvent accessibility, protein domain association and three dimensional structures. CONCLUSION: By utilizing the data available in Indel PDB, we have studied and presented here several sequence and structural features of indels. We anticipate that Indel PDB will not only enable future functional studies of indels, but will also assist protein modeling efforts and identification of indel-directed drug binding sites.
Project description:Taking advantage of the deep targeted sequencing capabilities of next generation sequencers, we have developed a novel two-step insertion/deletion (indel) detection algorithm (IDA) that can determine indels from single read sequences with high computational efficiency and sensitivity, even when indel-bearing reads make up only a small fraction relative to the wild-type reference sequence. First, it identifies candidate indel positions utilizing specific sequence alignment artifacts produced by rapid alignment programs. Second, it confirms the location of the candidate indel by using the Smith-Waterman (SW) algorithm on a restricted subset of sequence reads. We demonstrate that IDA is applicable to indels of varying sizes from deep targeted sequencing data at low fractions where the indel is diluted by wild-type sequence. Our algorithm is useful in detecting indel variants present at variable allelic frequencies such as may occur in heterozygotes and mixed normal-tumor tissue.
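The confirmation step described above relies on Smith-Waterman local alignment. The following is a minimal sketch of that alignment with a linear gap penalty, not the IDA implementation itself; the function name `smith_waterman` and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen for illustration, and the candidate-detection step is not shown.

```python
def smith_waterman(read, ref, match=2, mismatch=-1, gap=-2):
    """Local alignment of read against ref; returns (score, aligned_read, aligned_ref).

    Scoring values are illustrative assumptions, not taken from the abstract.
    """
    rows, cols = len(read) + 1, len(ref) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; scores never drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,   # align two bases
                              score[i - 1][j] + gap,       # gap in ref
                              score[i][j - 1] + gap)       # gap in read
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    a_read, a_ref = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if read[i - 1] == ref[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            a_read.append(read[i - 1]); a_ref.append(ref[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            a_read.append(read[i - 1]); a_ref.append("-"); i -= 1
        else:
            a_read.append("-"); a_ref.append(ref[j - 1]); j -= 1
    return best, "".join(reversed(a_read)), "".join(reversed(a_ref))

# A read missing one base relative to the reference (a 1 bp deletion).
sw_score, sw_read, sw_ref = smith_waterman("ACGTACGT", "ACGTTACGT")
```

A gap character in the aligned read marks a deletion relative to the reference, and a gap in the aligned reference marks an insertion. Because the deleted base sits in a two-base repeat, its exact position within the repeat is ambiguous.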
Project description:Indel mutations play key roles in genome and protein evolution, yet we lack a comprehensive understanding of how indels impact evolutionary processes. Genome-wide analyses enabled by next-generation sequencing can clarify the context and effect of indels, thereby integrating a more detailed consideration of indels with our knowledge of nucleotide substitutions. To this end, we sequenced Blochmannia chromaiodes, an obligate bacterial endosymbiont of carpenter ants, and compared it with the close relative, B. pennsylvanicus. The genetic distance between these species is small enough for accurate whole genome alignment but large enough to provide a meaningful spectrum of indel mutations. We found that indels are subjected to purifying selection in coding regions and even intergenic regions, which show a reduced rate of indel base pairs per kilobase compared with nonfunctional pseudogenes. Indels occur almost exclusively in repeat regions composed of homopolymers and multimeric simple sequence repeats, demonstrating the importance of sequence context for indel mutations. Despite purifying selection, some indels occur in protein-coding genes. Most are multiples of three, indicating selective pressure to maintain the reading frame. The deleterious effect of frameshift-inducing indels is minimized by either compensation from a nearby indel to restore reading frame or the indel's location near the 3'-end of the gene. We observed amino acid divergence exceeding nucleotide divergence in regions affected by frameshift-inducing indels, suggesting that these indels may either drive adaptive protein evolution or initiate gene degradation. Our results shed light on how indel mutations impact processes of molecular evolution underlying endosymbiont genome evolution.
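The reading-frame reasoning in the abstract above (indels in multiples of three preserve the frame, and a nearby second indel can restore it) comes down to arithmetic modulo 3. The sketch below illustrates it; the function names and the signed-length convention (positive for insertions, negative for deletions) are assumptions made for this example and do not come from the study.

```python
def classify_indel(length_bp):
    """Label a single indel by its effect on the reading frame."""
    if length_bp <= 0:
        raise ValueError("indel length must be a positive number of base pairs")
    return "in-frame" if length_bp % 3 == 0 else "frameshift"

def net_frame_offset(signed_lengths):
    """Net reading-frame offset (0, 1 or 2) after a series of indels in one gene.

    Insertions are positive and deletions negative (an assumed convention).
    """
    return sum(signed_lengths) % 3

def is_compensated(signed_lengths):
    """True if the combined indels leave the downstream reading frame intact."""
    return net_frame_offset(signed_lengths) == 0

single = classify_indel(4)            # a 4 bp indel shifts the frame
restored = is_compensated([+1, -1])   # a 1 bp insertion followed by a 1 bp deletion
```

Only the codons between two compensating indels are read out of frame, which fits the observation that amino acid divergence can exceed nucleotide divergence in such regions.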
Project description:Evolutionary constraint for insertions and deletions (indels) is not necessarily equal to constraint for nucleotide substitutions for any given region of a genome. Knowing the variation in indel-specific evolutionary rates across the sequence will aid our understanding of evolutionary constraints on indels, and help us infer how indels have contributed to the evolution of the sequence. However, unlike for nucleotide substitutions, there has been no phylogenetic method that can statistically infer significantly different rates of indels across the sequence space independent of substitution rates. Here, we have developed software that finds sites with accelerated evolutionary rates specific to indels, by introducing a scaling parameter that applies only to the indel rates and not to the nucleotide substitution rates. Using the software, we show that we can find regions of accelerated rates of indels in the protein alignments of primate genomes. We also confirm that the sites that have high rates of indels are different from the sites that have high rates of nucleotide substitutions within the protein sequences. By identifying regions with accelerated rates of indels independent of nucleotide substitutions, we will be able to better understand the impact of indel mutations on protein sequence evolution.
Project description:The Sau1 type I restriction-modification system is found on the chromosome of all nine sequenced strains of Staphylococcus aureus and includes a single hsdR (restriction) gene and two copies of hsdM (modification) and hsdS (sequence specificity) genes. The strain S. aureus RN4220 is a vital intermediate for laboratory S. aureus manipulation, as it can accept plasmid DNA from Escherichia coli. We show that it carries a mutation in the sau1hsdR gene and that complementation restored a nontransformable phenotype. Sau1 was also responsible for reduced conjugative transfer from enterococci, a model of vancomycin resistance transfer. This may explain why only four vancomycin-resistant S. aureus strains have been identified despite substantial selective pressure in the clinical setting. Using a multistrain S. aureus microarray, we show that the two copies of sequence specificity genes (sau1hsdS1 and sau1hsdS2) vary substantially between isolates and that the variation corresponds to the 10 dominant S. aureus lineages. Thus, RN4220 complemented with sau1hsdR was resistant to bacteriophage lysis but only if the phage was grown on S. aureus of a different lineage. Similarly, it could be transduced with DNA from its own lineage but not with the phage grown on different S. aureus lineages. Therefore, we propose that Sau1 is the major mechanism for blocking transfer of resistance genes and other mobile genetic elements into S. aureus isolates from other species, as well as for controlling the spread of resistance genes between isolates of different S. aureus lineages. Blocking Sau1 should also allow genetic manipulation of clinical strains of S. aureus.
Project description:Leptospirosis is a worldwide zoonosis, responsible for more than 1 million cases and 60,000 deaths every year. Among the 13 pathogenic species of the genus Leptospira, serovars belonging to L. interrogans serogroup Icterohaemorrhagiae are considered to be the most virulent strains, and responsible for the majority of the reported severe cases. Serovars Copenhageni and Icterohaemorrhagiae are major representatives of this serogroup and, despite their public health relevance, little is known regarding the genetic differences between these two serovars. In this study, we analyzed the genome sequences of 67 isolates belonging to L. interrogans serovars Copenhageni and Icterohaemorrhagiae to investigate the influence of spatial and temporal variations on DNA sequence diversity. Out of the 1072 SNPs identified, 276 were in non-coding regions and 796 in coding regions. Indel analyses identified 258 indels, out of which 191 were found in coding regions and 67 in non-coding regions. Our phylogenetic analyses based on the SNP dataset revealed that both serovars are closely related but showed distinct spatial clustering. However, a likelihood ratio test of the indel data statistically confirmed the presence of a frameshift mutation within a homopolymeric tract of the lic12008 gene (related to LPS biosynthesis) in all the L. interrogans serovar Icterohaemorrhagiae strains but not in the Copenhageni strains. Therefore, this internal indel can genetically distinguish L. interrogans serovar Copenhageni from serovar Icterohaemorrhagiae with high discriminatory power. To our knowledge, this is the first study to identify global sequence variations (SNPs and indels) in L. interrogans serovars Copenhageni and Icterohaemorrhagiae.
Project description:Induction of HIV-1 broad neutralizing antibodies (bnAbs) is a goal of HIV-1 vaccine development but has remained challenging, partially due to unusual traits of bnAbs, including high somatic hypermutation (SHM) frequencies and in-frame insertions and deletions (indels). Here we examined the propensity and functional requirement for indels within HIV-1 bnAbs. High-throughput sequencing of the immunoglobulin (Ig) VHDJH genes in HIV-1-infected and uninfected individuals revealed that the indel frequency was elevated among HIV-1-infected subjects, with no unique properties attributable to bnAb-producing individuals. This increased indel occurrence depended only on the frequency of SHM point mutations. Indel-encoded regions were generally proximal to antigen binding sites. Additionally, reconstruction of an HIV-1 CD4-binding site bnAb clonal lineage revealed that a large compound VHDJH indel was required for bnAb activity. Thus, vaccine development should focus on designing regimens targeted at sustained activation of bnAb lineages to achieve the required SHM and indel events.