Microbe-ID: an open source toolbox for microbial genotyping and species identification.
ABSTRACT: Development of tools to identify species, genotypes, or novel strains of invasive organisms is critical for monitoring emergence and implementing rapid response measures. Molecular markers, although critical to identifying species or genotypes, require bioinformatic tools for analysis. However, user-friendly analytical tools for fast identification are not readily available. To address this need, we created a web-based set of applications called Microbe-ID that allow for customizing a toolbox for rapid species identification and strain genotyping using any genetic markers of choice. Two components of Microbe-ID, named Sequence-ID and Genotype-ID, implement species and genotype identification, respectively. Sequence-ID allows identification of species by using BLAST to query sequences for any locus of interest against a custom reference sequence database. Genotype-ID allows placement of an unknown multilocus marker in either a minimum spanning network or dendrogram with bootstrap support from a user-created reference database. Microbe-ID can be used for identification of any organism based on nucleotide sequences or any molecular marker type, and several examples are provided. We created a public website for demonstration purposes called Microbe-ID (microbe-id.org) and provided a working implementation for the genus Phytophthora (phytophthora-id.org). In Phytophthora-ID, the Sequence-ID application allows identification based on ITS or cox spacer sequences. Genotype-ID groups individuals into clonal lineages based on simple sequence repeat (SSR) markers for the two invasive plant pathogen species P. infestans and P. ramorum. All code is open source and available on GitHub and CRAN. Instructions for installation and use are provided at https://github.com/grunwaldlab/Microbe-ID.
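The BLAST-based lookup described for Sequence-ID can be illustrated with a minimal sketch. It is not the Microbe-ID code: it only assumes that BLAST+ was run with its standard 12-column tabular output (-outfmt 6) and that the best match is the hit with the highest bit score. The function name best_hits and the sequence IDs in the example are hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text):
    """Map each query ID to its highest-bit-score hit (hypothetical helper)."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        hit["pident"] = float(hit["pident"])
        hit["bitscore"] = float(hit["bitscore"])
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


# Illustrative input with made-up reference IDs.
example = (
    "q1\tP_ramorum_ITS\t99.5\t800\t4\t0\t1\t800\t1\t800\t0.0\t1450\n"
    "q1\tP_infestans_ITS\t92.0\t790\t60\t3\t1\t790\t5\t794\t0.0\t1100\n"
)
top = best_hits(example)["q1"]
print(top["sseqid"], top["pident"])
```

Ranking by bit score is one common choice; a real deployment might also filter on percent identity or alignment length before reporting a species name.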
Project description:BACKGROUND: Microsatellites or simple sequence repeats (SSRs) are a powerful choice of marker in the study of Phytophthora population biology, epidemiology, ecology, genetics and evolution. A strategy was tested in which the publicly available unigene datasets extracted from genome sequences of P. infestans, P. sojae and P. ramorum were mined for candidate SSR markers that could be applied to a wide range of Phytophthora species. RESULTS: A first approach, aimed at the identification of polymorphic SSR loci common to many Phytophthora species, yielded 171 reliable sequences containing 211 SSRs. Microsatellites were identified from 16 target species representing the breadth of diversity across the genus. Repeat number ranged from 3 to 16, with most loci having seven repeats or fewer and four being the most common repeat number. Trinucleotide repeats such as (AAG)n, (AGG)n and (AGC)n were the most common, followed by pentanucleotide, tetranucleotide and dinucleotide repeats. A second approach was aimed at the identification of useful loci common to a restricted number of species more closely related to P. sojae (P. alni, P. cambivora, P. europaea and P. fragariae). This analysis yielded 10 trinucleotide and 2 tetranucleotide SSRs which were repeated 4, 5 or 6 times. CONCLUSION: Key studies on inter- and intra-specific variation of selected microsatellites remain. Despite the screening of conserved gene coding regions, the sequence diversity between species was high and the identification of useful SSR loci applicable to anything other than the most closely related pairs of Phytophthora species was challenging. That said, many novel SSR loci for species other than the three 'source species' (P. infestans, P. sojae and P. ramorum) are reported, offering great potential for the investigation of Phytophthora populations.
In addition to the presence of microsatellites, many of the amplified regions may represent useful molecular marker regions for other studies as they are highly variable and easily amplifiable from different Phytophthora species.
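Mining sequences for candidate SSRs, as described above, can be sketched with a regular expression that looks for perfect tandem repeats. The abstract does not name the software that was used, so this is only an illustration under stated assumptions: the function name find_ssrs is hypothetical, the minimum of three repeats is taken from the reported range (3 to 16), and the motif lengths 2 to 5 reflect the di- to pentanucleotide classes mentioned.

```python
import re


def find_ssrs(seq, min_repeats=3, motif_lengths=(2, 3, 4, 5)):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Hypothetical helper: thresholds are assumptions, not the authors' settings.
    """
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # A k-base motif followed by at least (min_repeats - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AAA" or "ATAT".
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Illustrative sequence: an (AAG)4 repeat flanked by short non-repetitive ends.
print(find_ssrs("CC" + "AAG" * 4 + "TT"))
```

The reported motif is whichever rotation the scan meets first, so (AAG)n may appear as GAA or AGA depending on the flanking bases; imperfect or compound repeats are not detected by this sketch.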
Project description:Here we present genome sequences for twelve isolates of the invasive pathogen Phytophthora ramorum EU1. The assembled genome sequences and raw sequence data are available via BioProject accession number PRJNA177509. These data will be useful in developing molecular tools for specific detection and identification of this pathogen.
Project description:Phytophthora is one of the most important and aggressive plant pathogenic genera in agriculture and forestry. Early detection and identification of its pathways of infection and spread are of high importance to minimize the threat these pathogens pose to natural ecosystems. eDNA was extracted from soil and water from forests and plantations in the north of Spain. Phytophthora-specific primers were adapted for use in high-throughput sequencing (HTS). Primers were tested in a control reaction containing eight Phytophthora species and applied to water and soil eDNA samples from northern Spain. Different score coverage threshold values were tested for optimal Phytophthora species separation in a custom-curated database and in the control reaction. Clustering at 99% was the optimal criterion for separating most of the Phytophthora species. Multiple Molecular Operational Taxonomic Units (MOTUs) corresponding to 36 distinct Phytophthora species were amplified in the environmental samples. Pyrosequencing of amplicons from soil samples revealed low Phytophthora diversity (13 species) in comparison with the 35 species detected in water samples. Thirteen of the MOTUs detected in rivers and streams showed no close match to sequences in international sequence databases, revealing that eDNA pyrosequencing is a useful strategy to assess Phytophthora species diversity in natural ecosystems.
Project description:Coagulase-negative staphylococci (CNS) are the most frequently isolated pathogens from cows with intramammary infection (IMI). Although API STAPH ID 20, a commercially available identification system, and PCR-restriction fragment length polymorphism (PCR-RFLP) of the gap gene (gap PCR-RFLP) have been successfully applied for the identification of CNS isolates from human specimens, their accuracy in the identification of veterinary isolates has not been fully established. In this study, we identified 263 CNS isolates from bovine IMI at species level by partial 16S rRNA gene sequence analysis as the definitive test. Species identification obtained using partial 16S rRNA gene sequence analysis was compared to results from the API STAPH ID 20 and gap PCR-RFLP analysis. Eleven different CNS species were identified by partial 16S rRNA gene sequence analysis. Only 76.0% (200/263) of the species identification results obtained by API STAPH ID 20 matched those obtained by partial 16S rRNA gene sequence analysis, whereas 97.0% (255/263) of the species identification results obtained by the gap PCR-RFLP analysis matched those obtained by partial 16S rRNA gene sequence analysis. The gap PCR-RFLP analysis could be a useful and reliable alternative method for the species identification of CNS isolates from bovine IMI and appears to be a more accurate method of species identification than the API STAPH ID 20 system.
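The agreement figures quoted above follow directly from the reported counts, with partial 16S rRNA gene sequencing as the reference. A minimal sketch of that arithmetic (the function name percent_agreement is hypothetical):

```python
def percent_agreement(matches, total):
    """Percentage of isolates whose identification matches the reference test."""
    return round(100.0 * matches / total, 1)


# Counts taken from the abstract: 263 CNS isolates in total.
print(percent_agreement(200, 263))  # API STAPH ID 20 vs. 16S sequencing -> 76.0
print(percent_agreement(255, 263))  # gap PCR-RFLP vs. 16S sequencing -> 97.0
```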
Project description:BACKGROUND: Constraint-based modeling is a widely used and powerful methodology to assess the metabolic phenotypes and capabilities of an organism. The starting point and cornerstone of all such modeling is a genome-scale metabolic network reconstruction. The creation, further development, and application of such networks is a growing field of research thanks to a plethora of readily accessible computational tools. While the majority of studies are focused on single-species analyses, typically of a microbe, the computational study of communities of organisms is gaining attention. Similarly, reconstructions that are unified for a multi-cellular organism have gained in popularity. Consequently, the rapid generation of genome-scale metabolic reconstructed networks is crucial. While multiple web-based or stand-alone tools are available for automated network reconstruction, there is, however, currently no publicly available tool that allows the swift assembly of draft reconstructions of community metabolic networks and consolidated metabolic networks for a specified list of organisms. RESULTS: Here, we present AutoKEGGRec, an automated tool that creates first draft metabolic network reconstructions of single organisms, community reconstructions based on a list of organisms, and finally a consolidated reconstruction for a list of organisms or strains. AutoKEGGRec is developed in Matlab and works seamlessly with the COBRA Toolbox v3, and it is based on only using the KEGG database as external input. The generated first draft reconstructions are stored in SBML files and consist of all reactions for a KEGG organism ID and corresponding linked genes. This provides a comprehensive starting point for further refinement and curation using the host of COBRA toolbox functions or other preferred tools.
Through the data structures created, the tool also facilitates a comparative analysis of metabolic content in any given number of organisms present in the KEGG database. CONCLUSION: AutoKEGGRec provides a first step in a metabolic network reconstruction process, filling a gap for tools creating community and consolidated metabolic networks. Based only on KEGG data as external input, the generated reconstructions consist of data with a directly traceable foundation and pedigree. With AutoKEGGRec, this kind of modeling is made accessible to a wider part of the genome-scale metabolic analysis community.
Project description:Agricultural high throughput diagnostics need to be fast, accurate and have multiplexing capacity. Metagenomic sequencing is being widely evaluated for plant and animal diagnostics. Bioinformatic analysis of metagenomic sequence data has been a bottleneck for diagnostic analysis due to the size of the data files. Most available tools for analyzing high-throughput sequencing (HTS) data require that the user have computer coding skills and access to high-performance computing. To overcome constraints to most sequencing-based diagnostic pipelines today, we have developed Microbe Finder (MiFi®). MiFi® is a web application for quick detection and identification of known pathogen species/strains in raw, unassembled HTS metagenomic data. HTS-based diagnostic tools developed through MiFi® must pass rigorous validation, which is outlined in this manuscript. MiFi® allows researchers to collaborate in the development and validation of HTS-based diagnostic assays using MiProbe™, a platform used for developing pathogen-specific e-probes. Validated e-probes are made available to diagnosticians through MiDetect™. Here we describe the e-probe development, curation and validation process of MiFi® using grapevine pathogens as a model system. MiFi® can be used with any pathosystem and HTS platform after e-probes have been validated.
Project description:Phytophthora blight caused by Phytophthora cajani is an emerging disease of pigeonpea (Cajanus cajan L.) affecting the crop irrespective of cropping system, cultivar grown and soil type. Current detection and identification methods for Phytophthora species rely primarily on cultural and morphological characteristics, the assessment of which is time-consuming and not always suitable. Sensitive and reliable methods for isolation, identification, zoospore production and estimating infection severity are therefore desirable in the case of Phytophthora blight of pigeonpea. In this study, protocols for isolation and identification of Phytophthora blight of pigeonpea were standardized. A method for zoospore production and in planta infection of P. cajani was also developed. Quantification of fungal colonization by P. cajani using real-time PCR was further standardized. The Phytophthora species infecting pigeonpea was identified based on mycological characters such as growth pattern, mycelium structure and sporangial morphology of the isolates and confirmed through molecular characterization (sequence deposited in GenBank). For Phytophthora disease development, a zoospore suspension of 1 × 10^5 zoospores per ml was found to be optimal. A Phytophthora-specific real-time PCR assay was developed using specific primers based on internal transcribed spacer (ITS) 1 and 2. Use of real-time PCR allowed the quantitative estimation of fungal biomass in plant tissues. Detection sensitivities were within the range of 0.001 pg fungal DNA. A study of the effect of elevated CO2 on Phytophthora blight incidence was also conducted, which indicated no significant difference in disease incidence, but the incubation period was delayed under elevated CO2
as compared to the ambient level. The zoospore infection method for Phytophthora blight of pigeonpea will facilitate small and large scale inoculation experiments and thus provide a platform for rapid and reliable screening against Phytophthora blight disease of pigeonpea. qPCR allowed reliable detection and quantification of P. cajani in samples with low pathogen densities. This can be useful in early warning systems prior to a potentially devastating outbreak of the disease.
Project description:Identification of specific genotypes can be accomplished by visual recognition of their distinct phenotypical appearance, as well as DNA analysis. Visual identification (ID) of species is subjective and usually requires substantial taxonomic expertise. Genotyping and sequencing are destructive, time- and labor-consuming. In this study, we investigate the potential use of Raman spectroscopy (RS) as a label-free, non-invasive and non-destructive analytical technique for the fast and accurate identification of peanut genotypes. We show that chemometric analysis of peanut leaflet spectra provides accurate identification of different varieties. This same analysis can be used for prediction of nematode resistance and oleic-linoleic oil (O/L) ratio. Raman-based analysis of seeds provides accurate genotype identification in 95% of samples. Additionally, we present data on the identification of carbohydrates, proteins, fiber and other nutrients obtained from spectroscopic signatures of peanut seeds. These results demonstrate that RS allows for fast, accurate and non-invasive screening and selection of plants which can be used for precision breeding.
Project description:The Phytophthora genus comprises some of the most destructive plant pathogens, which attack a wide range of hosts including economically valuable tree species, both angiosperm and gymnosperm. Many known species of Phytophthora are invasive and have been introduced through nursery and agricultural trade. As part of a larger project aimed at utilizing genomic data for forest disease diagnostics, pathogen detection and monitoring (The TAIGA project: Tree Aggressors Identification using Genomic Approaches; http://taigaforesthealth.com/), we sequenced the genomes of six Phytophthora species that are important invasive pathogens of trees and a serious threat to the international trade of forest products. These genomic data were used to develop highly sensitive and specific detection assays, for genome comparisons, and to make evolutionary inferences, and will be useful to the broader plant and tree health community. These WGS data have been deposited in the International Nucleotide Sequence Database Collaboration (DDBJ/ENA/GenBank) under the accession numbers AUPN01000000, AUVH01000000, AUWJ02000000, AUUF02000000, AWVV02000000 and AWVW02000000.
Project description:Intriguing and potentially commercially useful microorganisms are found in our surroundings and new tools allow us to learn about their genetic potential and evolutionary history. Engaging students from different disciplines and courses in the search for microbes requires an exciting project with innovative but straightforward procedures and goals. Here we describe an interdisciplinary program to engage students from different courses in the sampling, identification and analysis of the DNA sequences of a unique yet common microbe, Delftia spp. A campus-wide challenge was created to identify the prevalence of this genus, able to precipitate gold, involving introductory level environmental and life science courses, upper-level advanced laboratory modules taken by undergraduate students (juniors and seniors), graduate students and staff from the campus. The number of participants involved allowed for extensive sampling while undergraduate researchers and students in lab-based courses participated in the sample processing and analyses, helping contextualize and solidify their learning of the molecular biology techniques. The results were shared at each step through publicly accessible websites and workshops. This model allows for the rapid discovery of Delftia presence and prevalence and is adaptable to different campuses and experimental questions.