Project description:A chromium-reducing bacterium, designated strain KNP, was isolated from a tannery effluent sample collected in Kanpur, India. Phylogenetic analysis based on 16S rRNA gene sequences revealed that strain KNP belongs to the genus <i>Bacillus</i> and shows 100% similarity with <i>Bacillus licheniformis</i>. Furthermore, average nucleotide identity and digital DNA-DNA hybridization between strain KNP and its closely related strains confirmed its affiliation with the species <i>Bacillus licheniformis</i>. Whole-genome sequencing of <i>Bacillus licheniformis</i> KNP was performed using the Illumina HiSeq platform. Here, we present the draft genome sequence of <i>Bacillus licheniformis</i> KNP. The total size of the draft assembly was 4,280,093 bp, distributed into 21 contigs with an N50 value of 4,186,229 bp. The genome has 45.9% G + C content, 4255 coding sequences and 86 putative RNA genes. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession JACDXS000000000. The version described in this paper is version JACDXS010000000.
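The assembly statistics quoted above (N50 and G + C content) follow standard definitions. The sketch below illustrates them in plain Python. The function names `n50` and `gc_content` are illustrative assumptions, not part of any pipeline used by the authors.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs given")


def gc_content(sequence):
    """Return the G + C percentage of a sequence, ignoring non-ACGT symbols."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


if __name__ == "__main__":
    # Toy contig lengths, not the real KNP assembly.
    print(n50([2, 3, 4, 5, 6]))
    print(gc_content("GGCCAATT"))
```

An N50 close to the total assembly size, as reported for strain KNP, indicates that a single contig accounts for most of the genome.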
Project description:BACKGROUND: Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and functions as a starter for the production of the traditional Japanese food "natto" made from soybeans. Although re-sequencing of whole genomes of several laboratory-domesticated B. subtilis 168 derivatives has already been attempted using short read sequencing data, assembling the whole genome sequence of a closely related strain, B. subtilis natto, from very short read data is more challenging, particularly given our aim to assemble one fully connected scaffold from short reads around 35 bp in length. RESULTS: We applied a comparative genome assembly method, which combines de novo assembly and reference-guided assembly, to one of the B. subtilis natto strains. We successfully assembled 28 scaffolds and managed to avoid substantial fragmentation. Completion of the assembly through long PCR experiments resulted in one connected scaffold for B. subtilis natto. Based on the assembled genome sequence, our orthologous gene analysis between natto BEST195 and Marburg 168 revealed that 82.4% of 4375 predicted genes in BEST195 are one-to-one orthologous to genes in 168, with two genes in-paralog, 3.2% are deleted in 168, 14.3% are inserted in BEST195, and 5.9% of genes present in 168 are deleted in BEST195. The natto genome contains the same alleles in the promoter region of degQ and the coding region of swrAA as the wild strain, RO-FF-1. These are specific for gamma-PGA production ability, which is related to natto production. Further, the B. subtilis natto strain completely lacks a polyketide synthesis operon, has a disrupted plipastatin production operon, and possesses previously unidentified transposases. CONCLUSIONS: The determination of the whole genome sequence of Bacillus subtilis natto provided detailed analyses of a set of genes related to natto production, demonstrating the number and locations of insertion sequences that B. subtilis natto harbors but B. subtilis 168 lacks. Multiple genome-level comparisons among five closely related Bacillus species were also carried out. The determined genome sequence of B. subtilis natto and gene annotations are available from the Natto genome browser http://natto-genome.org/.
Project description:<h4>Background</h4>Recent progress in next-generation sequencing technology has afforded several improvements such as ultra-high throughput at low cost, very high read quality, and substantially increased sequencing depth. State-of-the-art high-throughput sequencers, such as the Illumina MiSeq system, can generate ~15 Gbp sequencing data per run, with >80% bases above Q30 and a sequencing depth of up to several 1000x for small genomes. Illumina HiSeq 2500 is capable of generating up to 1 Tbp per run, with >80% bases above Q30 and often >100x sequencing depth for large genomes. To speed up otherwise time-consuming genome assembly and/or to obtain a skeleton of the assembly quickly for scaffolding or progressive assembly, methods for noise removal and reduction of redundancy in the original data, with almost equal or better assembly results, are worth studying.<h4>Results</h4>We developed two subset selection methods for single-end reads and a method for paired-end reads based on base quality scores and other read analytic tools using the MapReduce framework. We proposed two strategies to select reads: MinimalQ and ProductQ. MinimalQ selects reads with minimal base-quality above a threshold. ProductQ selects reads with probability of no incorrect base above a threshold. In the single-end experiments, we used Escherichia coli and Bacillus cereus datasets of MiSeq, Velvet assembler for genome assembly, and GAGE benchmark tools for result evaluation. In the paired-end experiments, we used the giant grouper (Epinephelus lanceolatus) dataset of HiSeq, ALLPATHS-LG genome assembler, and QUAST quality assessment tool for comparing genome assemblies of the original set and the subset. The results show that subset selection not only can speed up the genome assembly but also can produce substantially longer scaffolds.<h4>Availability</h4>The software is freely available at https://github.com/moneycat/QReadSelector.
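The two selection strategies described above can be illustrated with a minimal sketch. It assumes Phred+33-encoded FASTQ quality strings and inclusive thresholds. The function names are hypothetical and do not reflect the QReadSelector code or its MapReduce implementation.

```python
def decode_phred33(quality_string):
    """Convert a FASTQ quality string (Phred+33 assumed) to integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def error_probability(q):
    """Phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


def minimal_q(quals, threshold):
    """MinimalQ-style rule: keep the read if its lowest base quality
    reaches the threshold (inclusive comparison is an assumption)."""
    return min(quals) >= threshold


def product_q(quals, threshold):
    """ProductQ-style rule: keep the read if the probability that no
    base is incorrect, i.e. the product of (1 - P_error) over all
    bases, reaches the threshold."""
    p_all_correct = 1.0
    for q in quals:
        p_all_correct *= 1.0 - error_probability(q)
    return p_all_correct >= threshold


if __name__ == "__main__":
    quals = decode_phred33("IIII5III")  # 'I' -> Q40, '5' -> Q20
    print(minimal_q(quals, 30))     # one Q20 base fails a Q30 minimum
    print(product_q(quals, 0.95))   # overall correctness probability
```

MinimalQ rejects a read on the basis of its single worst base, whereas ProductQ accumulates the error risk over the whole read, so long reads with many moderately good bases can fail ProductQ while passing MinimalQ.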
Project description:We report the draft genome sequences of Bacillus glennii V44-8, Bacillus saganii V47-23a, and Bacillus sp. strain V59.32b, isolated from the Viking spacecraft assembly cleanroom, and Bacillus sp. strain MER_TA_151 and Paenibacillus sp. strain MER_111, isolated from the Mars Exploration Rover (MER) assembly cleanroom.
Project description:A <i>Bacillus velezensis</i> strain was isolated from the rhizosphere of <i>Sporobolus airoides</i> (Torr.) Torr., a grass in central-north México, during a screening study for biocontrol of phytopathogens. The 2A-2B strain inhibited the growth of virulent isolates of root rot-causing phytopathogens by at least 60%. These phytopathogens include <i>Phytophthora capsici</i>, <i>Fusarium solani</i>, <i>Fusarium oxysporum</i> and <i>Rhizoctonia solani</i>. Furthermore, the 2A-2B strain produces indoleacetic acid and induces expression of PR1, a gene related to induced systemic resistance, in chili pepper plantlets. Whole genome sequencing was performed to generate a draft genome assembly of 3.953 Mb with 46.36% GC content and an N50 of 294,737 bp. The genome contains 3713 protein-coding genes and 89 RNA genes. Moreover, comparative genome analysis revealed that the 2A-2B strain had the greatest identity (98.4%) with <i>Bacillus velezensis</i>.
Project description:Parasporal crystalline inclusion proteins of some Bacillus spp. are of paramount importance due to their insecticidal, nematocidal, and cancer cell-killing capabilities. Here, we present a brief report of the complete genome sequence of Bacillus sp. BD59S, a bacterium that produces HeLa cell-killing parasporal crystalline inclusion proteins. From genome sequencing and assembly, we found that the bacterium has one circular chromosome and two large plasmids, pBTBD59S1 and pBTBD59S2. The size of the chromosome is 5,283,933 bp with a 35.4% GC content, consisting of 5938 genes and 5550 protein-coding sequences (CDSs), 25 complete rRNAs (5S, 16S, 23S), 98 tRNAs, 5 ncRNAs, 260 pseudogenes, and 356 subsystems. The complete sequence of plasmid pBTBD59S1 is 162,149 bp long with 33.4% GC content, 192 CDSs, and 13 subsystems. The other plasmid, pBTBD59S2, is 199,209 bp long with 32.9% GC content, 179 CDSs, and 11 subsystems. Analyses by NCBI microbial genome BLAST, phylogenetic genome tree, and BLAST ring image generator (BRIG) revealed that BD59S belongs to the Bacillus cereus group and is most closely related to B. thuringiensis. Further, the strain possesses genes encoding 57.04 kDa and 54.42 kDa Cry proteins, which show significant similarities with cancer cell-killing parasporin proteins of B. thuringiensis strains.
Project description:Most tailed bacteriophages (phages) feature linear dsDNA genomes. Characterizing novel phages requires an understanding of complete genome sequences, including the definition of the physical ends of the genome. We sequenced 48 Bacillus cereus phage isolates and analyzed next-generation sequencing (NGS) data to resolve the genome configuration of these novel phages. Most assembled contigs featured reads that mapped to both contig ends and formed circularized contigs. Independent assemblies of 31 nearly identical I48-like Bacillus phage isolates allowed us to observe that the assembly programs tended to cut circularized contigs at random positions. However, currently available assemblers were not capable of reporting the underlying phage genome configuration from sequence data. To identify the genome configuration of sequenced phages in silico, a terminus prediction method was developed using 'neighboring coverage ratios' and 'read edge frequencies' derived from read alignment files. Termini were confirmed by primer walking and supported by phylogenetic inference from large terminase protein sequences. The Terminus package, using phage NGS data along with contig circularity, could efficiently identify the approximate positions of phage genome termini. Complete phage genome sequences allow a proposed characterization of the potential packaging mechanisms and more precise genome annotation.
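The abstract does not define 'neighboring coverage ratios' or 'read edge frequencies'. The sketch below shows one plausible reading: the ratio of coverage at each position to coverage at the preceding position, and the number of reads starting at each position. The formulas, function names and pseudocount are assumptions for illustration only and should not be taken as the Terminus package's method.

```python
def coverage_and_read_starts(reads, contig_length):
    """Compute per-base coverage and read-start counts.

    reads: iterable of (start, end) alignments, 0-based, end exclusive.
    """
    coverage = [0] * contig_length
    starts = [0] * contig_length
    for start, end in reads:
        starts[start] += 1
        for pos in range(start, end):
            coverage[pos] += 1
    return coverage, starts


def neighboring_coverage_ratios(coverage, pseudocount=1):
    """Assumed definition: coverage at position i divided by coverage at
    position i - 1, with a pseudocount to avoid division by zero."""
    ratios = [1.0]  # position 0 has no left neighbor
    for i in range(1, len(coverage)):
        ratios.append((coverage[i] + pseudocount) / (coverage[i - 1] + pseudocount))
    return ratios


def candidate_terminus(reads, contig_length):
    """Return the position with the sharpest upward coverage step,
    together with the number of reads that start exactly there."""
    coverage, starts = coverage_and_read_starts(reads, contig_length)
    ratios = neighboring_coverage_ratios(coverage)
    position = max(range(contig_length), key=lambda i: ratios[i])
    return position, starts[position]


if __name__ == "__main__":
    # Toy data: one read over positions 0-4, five reads starting at position 5.
    toy_reads = [(0, 5)] + [(5, 10)] * 5
    print(candidate_terminus(toy_reads, 10))
```

In this toy case the coverage step and the pile-up of read starts coincide at the same position, which is the kind of signal a fixed genome end would be expected to produce in a randomly opened circular contig.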