Project description:A wide variety of terrestrial ecosystems in tundra have a ground vegetation cover composed of reindeer lichens (genera Cladonia and Cetraria). The microbial communities of two lichen-dominated ecosystems typical of the sub-arctic zone of northwestern Siberia, namely a forested tundra soil and a shallow acidic peatland, were examined in our study. As revealed by molecular analyses, soil and peat layers just beneath the lichen cover were abundantly colonized by bacteria from the phylum Planctomycetes. The highest abundance of planctomycetes detected by fluorescence in situ hybridization was in the range 2.2-2.7 × 10^7 cells per gram of wet weight. 16S rRNA gene fragments from the Planctomycetes comprised 8-13% of total 16S rRNA gene reads retrieved using Illumina paired-end sequencing from the soil and peat samples. Lichen-associated assemblages of planctomycetes displayed unexpectedly high diversity, with a total of 89,662 reads representing 1723 operational taxonomic units determined at 97% sequence identity. The soil of forested tundra was dominated by uncultivated members of the family Planctomycetaceae (53-71% of total Planctomycetes-like reads), while sequences affiliated with the Phycisphaera-related group WD2101 (recently assigned to the order Tepidisphaerales) were most abundant in peat (28-51% of total reads). Representatives of the Isosphaera-Singulisphaera group (14-28% of total reads) and the lineages defined by the genera Gemmata (1-4%) and Planctopirus-Rubinisphaera (1-3%) were present in both habitats. Two strains of Singulisphaera-like bacteria were isolated from the studied soil and peat samples. These planctomycetes displayed good tolerance of low temperatures (4-15°C) and were capable of growth on a number of polysaccharides, including lichenan, a characteristic component of lichen-derived phytomass.
Project description:Genotyping of RpoD mutants via amplicon sequencing from the following manuscript: "Systematic dissection of σ70 sequence diversity and function in bacteria" by Park and Wang (2020). Includes raw sequencing reads from samples from MAGE-seq single codon saturation mutagenesis and high-throughput fitness competition experiment as well as the RpoD ortholog mutants generated through recombineering and CRISPR selection.
Project description:Soil metagenomics has been touted as the "grand challenge" for metagenomics, as the high microbial diversity and spatial heterogeneity of soils make them unamenable to current assembly platforms. Here, we aimed to improve soil metagenomic sequence assembly by applying the Moleculo synthetic long-read sequencing technology. In total, we obtained 267 Gbp of raw sequence data from a native prairie soil; these data included 109.7 Gbp of short-read data (~100 bp) from the Joint Genome Institute (JGI), an additional 87.7 Gbp of rapid-mode read data (~250 bp), plus 69.6 Gbp (>1.5 kbp) from Moleculo sequencing. The Moleculo data alone yielded over 5,600 reads of >10 kbp in length, and over 95% of the unassembled reads mapped to contigs of >1.5 kbp. Hybrid assembly of all data resulted in more than 10,000 contigs over 10 kbp in length. We mapped three replicate metatranscriptomes derived from the same parent soil to the Moleculo subassembly and found that 95% of the predicted genes, based on their assignments to Enzyme Commission (EC) numbers, were expressed. The Moleculo subassembly also enabled binning of >100 microbial genome bins. We obtained via direct binning the first complete genome, that of "Candidatus Pseudomonas sp. strain JKJ-1", from a native soil metagenome. By mapping metatranscriptome sequence reads back to the bins, we found that several bins corresponding to low-relative-abundance Acidobacteria were highly transcriptionally active, whereas bins corresponding to high-relative-abundance Verrucomicrobia were not. These results demonstrate that Moleculo sequencing provides a significant advance for resolving complex soil microbial communities. IMPORTANCE: Soil microorganisms carry out key processes for life on our planet, including cycling of carbon and other nutrients and supporting growth of plants. However, there is poor molecular-level understanding of their functional roles in ecosystem stability and responses to environmental perturbations. This knowledge gap is largely due to the difficulty in culturing the majority of soil microbes. Thus, use of culture-independent approaches, such as metagenomics, promises the direct assessment of the functional potential of soil microbiomes. Soil is, however, a challenge for metagenomic assembly due to its high microbial diversity and variable evenness, resulting in low coverage and uneven sampling of microbial genomes. Despite increasingly large soil metagenome data volumes (>200 Gbp), the majority of the data do not assemble. Here, we used the cutting-edge approach of synthetic long-read sequencing technology (Moleculo) to assemble soil metagenome sequence data into long contigs and used the assemblies for binning of genomes.
Project description:High-throughput, culture-independent surveys of bacterial and archaeal communities in soil have illuminated the importance of both edaphic and biotic influences on microbial diversity, yet few studies compare the relative importance of these factors. Here, we employ multiplexed pyrosequencing of the 16S rRNA gene to examine soil- and cactus-associated rhizosphere microbial communities of the Sonoran Desert and the artificial desert biome of the Biosphere2 research facility. The results of our replicate sampling approach show that microbial communities are shaped primarily by soil characteristics associated with geographic locations, while rhizosphere associations are secondary factors. We found little difference between rhizosphere communities of the ecologically similar saguaro (Carnegiea gigantea) and cardón (Pachycereus pringlei) cacti. Both rhizosphere and soil communities were dominated by the disproportionately abundant Crenarchaeota class Thermoprotei, which comprised 18.7% of 183,320 total pyrosequencing reads from a comparatively small number (1,337 or 3.7%) of the 36,162 total operational taxonomic units (OTUs). OTUs common to both soil and rhizosphere samples comprised the bulk of raw sequence reads, suggesting that the shared community of soil and rhizosphere microbes constitutes common and abundant taxa, particularly in the bacterial phyla Proteobacteria, Actinobacteria, Planctomycetes, Firmicutes, Bacteroidetes, Chloroflexi, and Acidobacteria. The vast majority of OTUs, however, were rare and unique to either soil or rhizosphere communities and differed among locations dozens of kilometers apart. Several soil properties, particularly soil pH and carbon content, were significantly correlated with community diversity measurements. Our results highlight the importance of culture-independent approaches in surveying microbial communities of extreme environments.
Project description:The alignment of DNA sequences to proteins, allowing for frameshifts, is a classic method in sequence analysis. It can help identify pseudogenes (which accumulate mutations), analyze raw DNA and RNA sequence data (which may have frameshift sequencing errors), investigate ribosomal frameshifts, etc. Often, however, only ad hoc approximations or simulations are available to provide the statistical significance of a frameshift alignment score. We describe a method to estimate statistical significance of frameshift alignments, similar to classic BLAST statistics. (BLAST presently does not permit its alignments to include frameshifts.) We also illustrate the continuing usefulness of frameshift alignment with two 'post-genomic' applications: (i) when finding pseudogenes within the human genome, frameshift alignments show that most anciently conserved non-coding human elements are recent pseudogenes with conserved ancestral genes; and (ii) when analyzing metagenomic DNA reads from polluted soil, frameshift alignments show that most alignable metagenomic reads contain frameshifts, suggesting that metagenomic analysis needs to use frameshift alignment to derive accurate results.
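As a rough illustration of why frameshifts matter when comparing DNA to proteins, the sketch below translates a toy sequence in each forward reading frame and shows that a single deleted base moves the downstream protein into a different frame. It is not the significance-estimation method described above. The sequences and the `translate` helper are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed with codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate dna starting at offset `frame` (0, 1, or 2); trailing bases are ignored."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Hypothetical intact coding sequence and a copy with one base deleted (index 5).
intact = "ATGGCTAAAGGTCTGGAATTC"
mutated = intact[:5] + intact[6:]

print(translate(intact))       # MAKGLEF
print(translate(mutated, 0))   # MAKVWN  (frame 0 diverges after the deletion)
print(translate(mutated, 2))   # GKGLEF  (the original tail KGLEF reappears in frame 2)
```

A frameshift-aware aligner can, in effect, follow frame 0 up to the deletion and frame 2 afterwards, whereas a single-frame translation loses the downstream match.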
Project description:Soil bacteria can be a valuable source of antimicrobial compounds. Here, we report the complete genomes of four soil bacteria that were isolated by undergraduate microbiology students as part of a course-based research experience. These genomes were assembled using a hybrid approach combining paired-end Illumina reads with Oxford Nanopore Technologies MinION reads.
Project description:The soil ecosystem is critical for human health, affecting aspects of the environment from key agricultural and edaphic parameters to critical influence on climate change. Soil has more unknown biodiversity than any other ecosystem. We have applied diverse DNA extraction methods coupled with high throughput pyrosequencing to explore 4.88 × 10^9 bp of metagenomic sequence data from the longest continually studied soil environment (Park Grass experiment at Rothamsted Research in the UK). Results emphasize important DNA extraction biases and unexpectedly low seasonal and vertical soil metagenomic functional class variations. Clustering-based subsystems and carbohydrate metabolism had the largest quantity of annotated reads assigned, although <50% of reads were assigned at an E value cutoff of 10^-5. In addition, with the more detailed subsystems, cAMP signaling in bacteria (3.24±0.27% of the annotated reads) and the Ton and Tol transport systems (1.69±0.11%) were relatively highly represented. The most highly represented genome from the database was that for a Bradyrhizobium species. The metagenomic variance created by integrating natural and methodological fluctuations represents a global picture of the Rothamsted soil metagenome that can be used for specific questions and future inter-environmental metagenomic comparisons. However, only 1% of annotated sequences correspond to already sequenced genomes at 96% similarity and E values of <10^-5; thus, considerable genomic reconstruction efforts still have to be performed.
Project description:Microorganisms are useful environmental indicators, able to deliver essential insights into processes regarding mine land rehabilitation. To compare microbial communities from a chronosequence of mine land rehabilitation to pre-disturbance levels from reference sites covered by native vegetation, we sampled non-rehabilitated, rehabilitating and reference study sites from the Urucum Massif, Southwestern Brazil. From each study site, three composite soil samples were collected for chemical, physical, and metagenomics analysis. We used a paired-end library sequencing technology (NextSeq 500 Illumina); the reads were assembled using MEGAHIT. Coding DNA sequences (CDS) were identified using Kaiju in combination with non-redundant NCBI BLAST reference sequences containing archaea, bacteria, and viruses. Additionally, a functional classification was performed by EMG v2.3.2. Here, we provide the raw data and assembly (reads and contigs), followed by initial functional and taxonomic analysis, as a baseline for further studies of this kind. Further investigation is needed to fully understand the mechanisms of environmental rehabilitation in tropical regions, and we hope this collection will encourage other researchers to use it for hypothesis testing.
Project description:The Oxford Nanopore MinION is an affordable and portable DNA sequencer that can produce very long reads (tens of kilobase pairs), which enable de novo bacterial genome assembly. Although many algorithms and tools have been developed for base calling, read mapping, de novo assembly, and polishing, no automated pipeline is available for one-stop reconstruction of circular bacterial genomes. In this paper, we present the pipeline CCBGpipe for completing circular bacterial genomes. Raw current signals are demultiplexed and base called to generate sequencing data. Sequencing reads are de novo assembled several times by using a sampling strategy to produce circular contigs that have a sequence in common between their start and end. The circular contigs are polished by using raw signals and sequencing reads; then, duplicated sequences are removed to form a linear representation of circular sequences. The circularized contigs are finally rearranged to start at the start position of dnaA/repA or a replication origin based on the GC skew. CCBGpipe, implemented in Python, is available at https://github.com/jade-nhri/CCBGpipe. Using sequencing data produced from a single MinION run, we obtained 48 circular sequences, comprising 12 chromosomes and 36 plasmids of 12 bacteria, including Acinetobacter nosocomialis, Acinetobacter pittii, and Staphylococcus aureus. With adequate quantities of sequencing reads (80×), CCBGpipe can provide a complete and automated assembly of circular bacterial genomes.
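The last two steps described above (removing the duplicated end of a circular contig, then rotating it to a GC-skew-based start) can be sketched in a few lines. This is a minimal illustration on a toy sequence, not CCBGpipe's actual code: the function names are hypothetical, exact string matching stands in for real overlap detection, and the minimum of the cumulative GC skew is used as a common heuristic for the replication origin. The dnaA/repA lookup is not covered.

```python
def trim_terminal_overlap(contig, min_overlap=4):
    """Drop the suffix that exactly repeats the contig's prefix (longest match first)."""
    for k in range(len(contig) // 2, min_overlap - 1, -1):
        if contig[:k] == contig[-k:]:
            return contig[:-k]
    return contig  # no terminal overlap found; contig left unchanged


def min_cumulative_gc_skew_index(seq):
    """Return the position (bases consumed) where the cumulative G-minus-C count is lowest."""
    skew, best_skew, best_idx = 0, 0, 0
    for i, base in enumerate(seq, start=1):
        if base == "G":
            skew += 1
        elif base == "C":
            skew -= 1
        if skew < best_skew:
            best_skew, best_idx = skew, i
    return best_idx


def rotate_to_skew_minimum(seq):
    """Rotate a circular sequence so it starts at the cumulative GC skew minimum."""
    idx = min_cumulative_gc_skew_index(seq)
    return seq[idx:] + seq[:idx]


# Toy circular contig whose last 5 bases repeat its first 5 (hypothetical data).
linear = "AGGTCCCCAGGG" + "AGGTC"
circular = trim_terminal_overlap(linear)
print(circular)                          # AGGTCCCCAGGG
print(rotate_to_skew_minimum(circular))  # AGGGAGGTCCCC
```

Real assemblies contain sequencing errors, so an actual pipeline would need alignment-based overlap detection and a windowed skew calculation rather than the exact matching used here.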