Shotgun metagenomic sequencing based microbial diversity assessment of Lasundra hot spring, India.
ABSTRACT: This is the first report of a metagenomic approach to unveiling the microbial diversity of the Lasundra hot spring, Gujarat State, India. High-throughput sequencing of community DNA was performed on an Ion Torrent PGM platform. The metagenome consisted of 606,867 sequences representing 98,567,305 bp, with an average length of 162 bp and 46% G + C content. The metagenome sequence data are available in the EBI Metagenomics database under accession no. ERP009313. MG-RAST-assisted community analysis revealed that 99.21% of sequences were of bacterial origin, 0.43% were assigned to eukaryotes and 0.11% to archaea. A total of 29 bacterial, 20 eukaryotic and 4 archaeal phyla were detected. The most abundant genera were Bacillus (86.7%), Geobacillus (2.4%), Paenibacillus (1.0%), Clostridium (0.7%) and Listeria (0.5%), which together represent 91.52% of the metagenome. In the functional analysis, Clusters of Orthologous Groups (COG)-based annotation revealed that 45.4% of sequences were associated with metabolism and 19.6% fell into the poorly characterized group. Subsystem-based annotation suggested that 14.0% of sequences were related to carbohydrates, 7.0% to protein metabolism and 3.0% to various stress responses, together with a versatile presence of commercially useful traits.
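The summary figures reported above can be sanity-checked with simple arithmetic. The following minimal Python sketch recomputes the mean read length from the reported totals, derives the share of sequences not assigned to any of the three reported domains, and shows how G + C content is defined for a single sequence. The function names are illustrative assumptions and are not part of MG-RAST or of the authors' pipeline.

```python
def mean_read_length(total_bp: int, n_sequences: int) -> float:
    """Mean read length in bp, computed from dataset totals."""
    if n_sequences <= 0:
        raise ValueError("n_sequences must be positive")
    return total_bp / n_sequences


def unassigned_share(*domain_percentages: float) -> float:
    """Percentage of sequences left over after the listed domain shares."""
    return 100.0 - sum(domain_percentages)


def gc_content(sequence: str) -> float:
    """G + C content of one nucleotide sequence, as a percentage."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    seq = sequence.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Totals as reported for the Lasundra hot spring metagenome.
    length = mean_read_length(98_567_305, 606_867)
    print(f"mean read length: {length:.1f} bp")  # rounds to the reported 162 bp

    # Bacterial, eukaryotic and archaeal shares as reported.
    print(f"unassigned: {unassigned_share(99.21, 0.43, 0.11):.2f}%")

    # Toy sequence, used only to illustrate the G + C definition.
    print(f"G + C: {gc_content('ATGC'):.1f}%")
```

The same check can be applied to any of the datasets in this listing that report both a sequence count and a total size in base pairs.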
Project description:We present the metagenomic dataset of the microbial DNA of a termite mound in the North West Province of South Africa. This is the first account of the microbial diversity of termite mound soil in the Province obtained with the shotgun metagenomics approach. Next-generation sequencing of the community DNA was carried out on an Illumina MiSeq platform. The metagenome comprised 7,270,818 sequences representing 1,172,099,467 bp, with a mean length of 161 bp and 52% G + C content. The sequence data are accessible at the NCBI SRA under BioProject number PRJNA526912. Metagenomic Rapid Annotations using Subsystems Technology (MG-RAST) was employed for community analysis; 0.36% of sequences were of archaeal origin, 9.51% were eukaryotic and 90.01% were assigned to bacteria. A total of 5 archaeal, 27 bacterial and 22 eukaryotic phyla were revealed. The most abundant genera were Sphingomonas (6.00%), Streptomyces (5.00%), Sphingobium (4.00%), Sphingopyxis (3.00%) and Mycobacterium (3.00%), representing 19.23% of the metagenome. For the functional analysis, Clusters of Orthologous Groups (COG)-based annotation showed that 46.44% of sequences were associated with metabolism and 17.45% fell into the poorly characterized category. Subsystem-based annotation indicated that 14.00% of sequences were associated with carbohydrates, 13.00% with clustering-based subsystems and 10.00% with amino acids and derivatives, together with other traits of potential scientific value.
Project description:This dataset presents shotgun metagenomic sequencing of the sunflower rhizosphere microbiome in Bloemhof, South Africa. Data were collected to decipher the structure and function of the sunflower microbial community. Next-generation sequencing of the DNA was carried out on an Illumina HiSeq platform. The metagenome comprised 8,991,566 sequences totaling 1,607,022,279 bp, with 66% GC content. The metagenome was deposited in the NCBI database and can be accessed under SRA accession number SRR10418054. Analysis on the online metagenome server MG-RAST using the subsystem database revealed that bacteria had the highest taxonomic representation at 98.47%, followed by eukaryotes at 1.23% and archaea at 0.20%. The most abundant genera were Conexibacter (17%), Nocardioides (8%), Streptomyces (7%), Geodermatophilus (6%), Methylobacterium (5%) and Burkholderia (4%). MG-RAST-assisted subsystem-based functional annotation assigned 13.74% of sequences to carbohydrates, 12.93% to clustering-based subsystems and 10.30% to amino acids and derivatives, alongside other functional traits useful for plant growth and health.
Project description:A taxonomic description of the bacteria was deduced from 5.78 Mb of metagenomic sequence retrieved from the Tulsi Shyam hot spring, India, using bacterial tag-encoded FLX amplicon pyrosequencing (bTEFAP). The metagenome contained 10,893 16S rDNA sequences, which were analyzed on the MG-RAST server to generate a comprehensive bacterial profile. The metagenomic data are available in the EBI Metagenomics database under accession no. ERP009559. Of the metagenome sequences, 98.2% were of bacterial origin, 1.5% were eukaryotic and 0.3% were unidentified. A total of 16 bacterial phyla comprising 97 families and 287 species were found in the hot spring metagenome. The most abundant phyla were Firmicutes (65.38%), Proteobacteria (21.21%) and unclassified bacteria (10.69%), whereas Peptostreptococcaceae (37.33%), Clostridiaceae (23.36%) and Enterobacteriaceae (16.37%) were the most abundant families. The most abundant species were Clostridium bifermentans (17.47%), Clostridium lituseburense (13.93%) and uncultured bacterium (10.15%). Our data provide new information on hot spring bacteria and shed light on their abundance, diversity, distribution and coexisting organisms.
Project description:This dataset represents the root endophytic microbial community profile of maize (Zea mays L.), one of the largest food crops in South Africa, obtained using a shotgun metagenomic approach. To the best of our knowledge, this is the first account of the endophytic microbial diversity in maize plants obtained via the shotgun metagenomics approach. High-throughput sequencing of the whole community DNA was carried out using the NovaSeq 6000 system (Illumina). The data obtained consist of 10,915,268 sequences accounting for 261,906,948 bp, with an average length of 153 bp and 43% guanine + cytosine content. The metagenome data can be accessed at the National Center for Biotechnology Information SRA under accession number PRJNA607664. Community analysis was done using the online server MG-RAST, which showed that 0.12% of the sequences were archaeal, 15.06% were eukaryotic and 84.77% were classified as bacterial. A total of 28 bacterial, 22 eukaryotic and 4 archaeal phyla were identified. The predominant genera were Bacillus (16%), Chitinophaga (12%), Flavobacterium (4%), Chryseobacterium (4%), Paenibacillus (4%), Pedobacter (3%) and Alphaproteobacteria (3%). Annotation using Clusters of Orthologous Groups (COG) revealed that 41.47% of the sequences were associated with metabolic functions, 24.10% with cellular processes and signaling, and 17.43% fell into the poorly characterized group. Annotation using the subsystem method showed that 18% of the sequences were associated with carbohydrates, 9% with clustering-based subsystems and 9% with amino acids and derivatives, which might be beneficial for plant growth and health.
Project description:Kimchi, a traditional food in the Korean culture, is made from vegetables by fermentation. In this study, metagenomic approaches were used to monitor changes in bacterial populations, metabolic potential, and overall genetic features of the microbial community during the 29-day fermentation process. Metagenomic DNA was extracted from kimchi samples obtained periodically and was sequenced using a 454 GS FLX Titanium system, which yielded a total of 701,556 reads, with an average read length of 438 bp. Phylogenetic analysis based on 16S rRNA genes from the metagenome indicated that the kimchi microbiome was dominated by members of three genera: Leuconostoc, Lactobacillus, and Weissella. Assignment of metagenomic sequences to SEED categories of the Metagenome Rapid Annotation using Subsystem Technology (MG-RAST) server revealed a genetic profile characteristic of heterotrophic lactic acid fermentation of carbohydrates, which was supported by the detection of mannitol, lactate, acetate, and ethanol as fermentation products. When the metagenomic reads were mapped onto the database of completed genomes, the Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293 and Lactobacillus sakei subsp. sakei 23K genomes were highly represented. These same two genera were confirmed to be important in kimchi fermentation when the majority of kimchi metagenomic sequences showed very high identity to Leuconostoc mesenteroides and Lactobacillus genes. Besides microbial genome sequences, a surprisingly large number of phage DNA sequences were identified from the cellular fractions, possibly indicating that a high proportion of cells were infected by bacteriophages during fermentation. Overall, these results provide insights into the kimchi microbial community and also shed light on fermentation processes carried out broadly by complex microbial communities.
Project description:An understanding of microbial community structure is an important issue in the field of molecular ecology. The traditional molecular method involves amplification of small subunit ribosomal RNA (SSU rRNA) genes by polymerase chain reaction (PCR). However, PCR-based amplicon approaches are affected by primer bias and chimeras. With the development of high-throughput sequencing technology, unbiased SSU rRNA gene sequences can be mined from shotgun sequencing-based metagenomic or metatranscriptomic datasets to obtain a reflection of the microbial community structure in specific types of environment and to evaluate SSU primers. However, the use of short reads obtained through next-generation sequencing for primer evaluation has not been well resolved. The software MIPE (MIcrobiota metagenome Primer Explorer) was developed to handle the large numbers of short reads produced from metagenomes and metatranscriptomes. Using metagenomic or metatranscriptomic datasets as input, MIPE extracts and aligns rRNA sequences to reveal detailed information on microbial composition and to evaluate SSU rRNA primers. A mock dataset, a real Metagenomics Rapid Annotation using Subsystem Technology (MG-RAST) test dataset, two PrimerProspector test datasets and a real metatranscriptomic dataset were used to validate MIPE. The software calls Mothur (v1.33.3) and the SILVA database (v119) for the alignment and classification of rRNA genes from a metagenome or metatranscriptome. MIPE can effectively extract shotgun rRNA reads from a metagenome or metatranscriptome, classify these sequences and report their sensitivity to different SSU rRNA PCR primers. Therefore, MIPE can be used to guide primer design for specific environmental samples.
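The idea of evaluating a degenerate SSU rRNA primer against shotgun reads can be illustrated with a small Python sketch. This is not MIPE's algorithm (MIPE relies on Mothur and SILVA alignments, as stated above); it is a simplified stand-in that slides a primer along each read, counts mismatches using the standard IUPAC degenerate-base codes, and reports the fraction of reads matched within a mismatch limit. The function names, the example primer and the reads are hypothetical and chosen only for illustration.

```python
# Standard IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatches(primer: str, site: str) -> int:
    """Count positions where the site base is not allowed by the primer code."""
    return sum(base not in IUPAC[code] for code, base in zip(primer, site))


def best_hit(primer: str, read: str):
    """Return (mismatch count, offset) of the best primer placement on a read.

    Returns None when the read is shorter than the primer.
    """
    k = len(primer)
    if len(read) < k:
        return None
    return min(
        (mismatches(primer, read[i:i + k]), i)
        for i in range(len(read) - k + 1)
    )


def coverage(primer: str, reads, max_mismatches: int = 0) -> float:
    """Fraction of sufficiently long reads matched within max_mismatches."""
    hits = [h for h in (best_hit(primer, r) for r in reads) if h is not None]
    if not hits:
        return 0.0
    return sum(h[0] <= max_mismatches for h in hits) / len(hits)


if __name__ == "__main__":
    # Hypothetical degenerate primer (M = A or C) and toy reads.
    primer = "GTGCCAGCMGCCGCGGTAA"
    reads = [
        "TTGTGCCAGCAGCCGCGGTAATAC",  # perfect match (A at the M position)
        "TTGTGCCAGCCGCCGCGGTAATAC",  # perfect match (C at the M position)
        "TTGTGCCAGCTGCCGCGGTAATAC",  # one mismatch (T at the M position)
    ]
    print(coverage(primer, reads))                    # strict matching
    print(coverage(primer, reads, max_mismatches=1))  # one mismatch tolerated
```

The sketch assumes every read already spans the primer binding site. With real short reads most will not, which is one reason an alignment-based approach such as the one described above is needed.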
Project description:We report the metagenome of a saline desert soil sample from the Little Rann of Kutch, Gujarat State, India. The metagenome consisted of 633,760 sequences totaling 141,307,202 bp, with 56% G + C content. The metagenome sequence data are available in the EBI Metagenomics database under accession no. ERP005612. Community metagenomics revealed a total of 1802 species belonging to 43 different phyla, with Marinobacter (48.7%) and Halobacterium (4.6%) the dominant genera in the bacterial and archaeal domains, respectively. Notably, the functional metagenome analysis showed 18.2% of sequences in the poorly characterized group and 4% of genes associated with various stress responses, along with a versatile presence of commercial enzymes.
Project description:In Puerto Rico, the microbial diversity of the thermal spring (ThS) in Coamo has never been studied using metagenomics. The focus of our research was to generate a metagenomic library from the ThS of Coamo, Puerto Rico, and to explore its microbial and functional diversity. The metagenomic library from the ThS waters was generated using direct DNA isolation. High-molecular-weight (40 kbp) DNA was end-repaired, electroeluted and ligated into a fosmid vector (pCCFOS1), then transduced into Escherichia coli EPI300-T1R using T1 bacteriophages. The library consisted of approximately 6000 clones, 90% of which contained metagenomic DNA. Next-generation sequencing technology (Illumina MiSeq) was used to process the ThS metagenome. After removal of the cloning vector, 122,026 sequences totaling 33.10 Mbp, with 64% G + C content, were annotated and analyzed using the MG-RAST online server. Bacteria were the most abundant domain (95.84%), followed by unidentified sequences (2.28%), viruses (1.67%), eukaryotes (0.15%) and archaea (0.01%). The most abundant phyla were Proteobacteria (95.03%), followed by unidentified (2.28%), unclassified viral sequences (1.74%), Firmicutes (0.20%) and Actinobacteria (0.18%). The most abundant species were Escherichia coli, Polaromonas naphthalenivorans, Albidiferax ferrireducens and Acidovorax sp. Subsystem functional analysis showed that 20% of genes belong to transposable elements, 10% to clustering-based subsystems and 8% to the production of cofactors. Functional analysis using NOG annotation showed that 82.79% of proteins are poorly characterized, indicating the possibility of novel microbial functions with potential biomedical and biotechnological applications. The metagenomic data were deposited in the NCBI database under accession number SAMN06131862.
Project description:Motivation: Functional and taxonomic analyses are critical steps in understanding interspecific interactions within microbial communities. Currently, such analyses are run separately, which complicates interpretation of results. Here we present the ASAR interactive tool for simultaneous analysis of metagenomic data in three dimensions: taxonomy, function and metagenome. Results: An interactive data analysis tool for selection, aggregation and visualization of metagenomic data is presented. Functional analysis with the SEED hierarchy and with pathway diagrams based on KEGG Orthology is available, built on MG-RAST annotation results. Availability and implementation: Source code of ASAR is accessible on GitHub (https://github.com/Askarbek-orakov/ASAR). Contact: email@example.com or firstname.lastname@example.org.
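The notion of aggregating annotation counts along any combination of the three dimensions (metagenome, taxonomy, function) can be sketched in a few lines of Python. This is an illustrative assumption about how such an aggregation might look, not ASAR's actual implementation; the record layout, the `aggregate` function and all counts are hypothetical.

```python
from collections import defaultdict

# Position of each dimension inside a record tuple
# (metagenome, taxon, function, count).
DIMENSIONS = {"metagenome": 0, "taxon": 1, "function": 2}

# Hypothetical annotation counts, for illustration only.
RECORDS = [
    ("mg1", "Bacillus", "Carbohydrates", 120),
    ("mg1", "Bacillus", "Stress Response", 30),
    ("mg1", "Geobacillus", "Carbohydrates", 10),
    ("mg2", "Bacillus", "Carbohydrates", 80),
    ("mg2", "Marinobacter", "Stress Response", 40),
]


def aggregate(records, keep):
    """Sum counts over every dimension not listed in `keep`.

    `keep` is a tuple of dimension names; the result maps tuples of the
    kept dimension values to the summed count.
    """
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[DIMENSIONS[name]] for name in keep)
        totals[key] += record[3]
    return dict(totals)


if __name__ == "__main__":
    # Collapse metagenome and function: total count per taxon.
    print(aggregate(RECORDS, ("taxon",)))
    # Collapse taxonomy: functional profile of each metagenome.
    print(aggregate(RECORDS, ("metagenome", "function")))
```

Keeping one table indexed by all three dimensions and collapsing it on demand is what allows taxonomic and functional views to be derived from the same annotation results rather than from separate analyses.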
Project description:The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.