Project description: The goal of the Human Microbiome Project (HMP) is to generate a comprehensive catalog of human-associated microorganisms, including reference genomes representing the most common species. Toward this goal, the HMP has characterized the microbial communities at 18 body habitats in a cohort of over 200 healthy volunteers using 16S rRNA gene (16S) sequencing and has generated nearly 1,000 reference genomes from human-associated microorganisms. To determine how well current reference genome collections capture the diversity observed among the healthy microbiome and to guide isolation and future sequencing of microbiome members, we compared the HMP's 16S data sets to several reference 16S collections to create a 'most wanted' list of taxa for sequencing. Our analysis revealed that the diversity of commonly occurring taxa within the HMP cohort microbiome is relatively modest, that few novel taxa are represented by these OTUs, and that many common taxa among HMP volunteers recur across different populations of healthy humans. Taken together, these results suggest that it should be possible to perform whole-genome sequencing on a large fraction of the human microbiome, including the 'most wanted', and that these sequences should serve to support microbiome studies across multiple cohorts. Also, in stark contrast to other taxa, the 'most wanted' organisms are poorly represented among culture collections, suggesting that novel culture- and single-cell-based methods will be required to isolate these organisms for sequencing.
Project description: The Human Microbiome Project (HMP) aims to characterize the microbial communities of 18 body sites from healthy individuals. To accomplish this, the HMP generated two types of shotgun data: reference shotgun sequences isolated from different anatomical sites on the human body and shotgun metagenomic sequences from the microbial communities of each site. The alignment strategy for characterizing these metagenomic communities using available reference sequences is important to the success of HMP data analysis. Six next-generation aligners were used to align a community of known composition against a database comprising reference organisms known to be present in that community. All aligners reported nearly complete genome coverage (>97%) for strains with over 6X depth of coverage; however, they differed in speed, memory requirements, and ease-of-use issues such as database size limitations and supported mapping strategies. The selected aligner was tested across a range of parameters to maximize sensitivity while maintaining a low false positive rate. We found that constraining alignment length had more impact on sensitivity than constraining similarity in all cases tested. However, when reference species were replaced with phylogenetic neighbors, similarity began to play a larger role in detection. We also show that choosing the top hit randomly when multiple, equally strong mappings are available increases overall sensitivity at the expense of taxonomic resolution. The results of this study identified a strategy that was used to map over 3 terabases of microbial sequence against a database of more than 5,000 reference genomes in just over a month.
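The filtering and tie-breaking described in this abstract can be illustrated with a minimal Python sketch. It is not the HMP pipeline: the `Hit` record, the `select_hits` function, and the default thresholds (`min_len_frac`, `min_identity`) are hypothetical, since the abstract does not state the parameter values that were chosen. The sketch keeps a hit only if it passes an alignment-length constraint and a similarity constraint, then picks one hit at random among the equally strong top-scoring candidates for each read.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One read-to-reference alignment (hypothetical record layout)."""
    read_id: str
    ref_id: str
    read_len: int        # length of the read in bases
    aligned_len: int     # number of read bases covered by the alignment
    pct_identity: float  # percent identity over the aligned region
    score: float         # aligner score, higher is better


def select_hits(hits, min_len_frac=0.75, min_identity=80.0, seed=0):
    """Filter hits, then choose one top hit per read.

    min_len_frac and min_identity are illustrative defaults only.
    Ties between equally strong hits are broken at random.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    by_read = defaultdict(list)
    for h in hits:
        long_enough = h.aligned_len / h.read_len >= min_len_frac
        similar_enough = h.pct_identity >= min_identity
        if long_enough and similar_enough:
            by_read[h.read_id].append(h)

    chosen = {}
    for read_id, candidates in by_read.items():
        best = max(c.score for c in candidates)
        top = [c for c in candidates if c.score == best]
        chosen[read_id] = rng.choice(top)  # random pick among ties
    return chosen


example_hits = [
    Hit("r1", "refA", 100, 95, 98.0, 180.0),
    Hit("r1", "refB", 100, 95, 98.0, 180.0),  # tie with refA
    Hit("r1", "refC", 100, 40, 99.0, 75.0),   # fails the length constraint
    Hit("r2", "refA", 100, 90, 70.0, 90.0),   # fails the similarity constraint
]
result = select_hits(example_hits)
```

In this example `r1` is assigned to either `refA` or `refB`, and `r2` is left unmapped. The random choice keeps the read in the analysis but cannot say which of the two references it truly came from, which is the loss of taxonomic resolution the abstract mentions.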
Project description: Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) is a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. Left unidentified, human reads can complicate and slow down further analysis, lead to inaccurately labeled microbial taxa, and ultimately raise privacy concerns as more human genome data are collected.
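The two summary statistics quoted in this abstract (the human fraction of a single sample, and the share of samples at or above a 10% human-read threshold) can be computed as in the following minimal Python sketch. The function names and the read counts are hypothetical and chosen only for illustration; they are not HMP data.

```python
def human_fraction(human_reads, total_reads):
    """Fraction of a sample's reads classified as human."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return human_reads / total_reads


def summarize_samples(counts, threshold=0.10):
    """Return per-sample fractions, flagged sample IDs, and their share.

    counts maps sample ID -> (human_reads, total_reads).
    A sample is flagged when its human fraction is >= threshold.
    """
    fractions = {s: human_fraction(h, t) for s, (h, t) in counts.items()}
    flagged = sorted(s for s, f in fractions.items() if f >= threshold)
    return fractions, flagged, len(flagged) / len(fractions)


# Made-up counts for four samples: (human reads, total reads).
toy_counts = {
    "s1": (950, 1000),
    "s2": (50, 1000),
    "s3": (100, 1000),
    "s4": (0, 1000),
}
fractions, flagged, share = summarize_samples(toy_counts)
```

With these toy counts, `s1` and `s3` are flagged, so half of the samples meet the threshold. Applied to real classification output, the same calculation would yield the kind of figures reported above.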
Project description: The genus Enterobacter comprises common pathogens and has a complicated taxonomy. Precise taxonomic assignment lays a foundation for microbiology. In this study, we updated the Enterobacter taxonomy based on robust genome analyses. We found that all Enterobacter subspecies assignments were incorrect. Enterobacter cloacae subsp. dissolvens and Enterobacter hormaechei subsp. hoffmannii are species (Enterobacter dissolvens and Enterobacter hoffmannii, respectively) rather than subspecies. Enterobacter xiangfangensis, Enterobacter hormaechei subsp. oharae, and Enterobacter hormaechei subsp. steigerwaltii are not Enterobacter hormaechei subspecies but belong to the same species (Enterobacter xiangfangensis). Enterobacter timonensis should be moved to Pseudenterobacter, a novel genus. We then reported two novel species, Enterobacter quasiroggenkampii and Enterobacter quasimori, by genome- and phenotype-based characterization. We also applied the updated taxonomy to curate 1,997 Enterobacter genomes in GenBank. Species identification was changed following our updated taxonomy for the majority of publicly available strains (1,542, 77.2%). The most common Enterobacter species was E. xiangfangensis. We identified 14 novel tentative Enterobacter genomospecies. This study highlights that updated and curated taxonomic assignments are a prerequisite for correct identification. IMPORTANCE: Enterobacter species are major human pathogens. Precise species identification lays a foundation for microbiology, but the taxonomy of Enterobacter is complicated and confusing. In this study, first, we significantly updated the taxonomy of Enterobacter by rigorous genome analyses and found that all subspecies assignments of Enterobacter were incorrect. Second, we characterized and reported two novel Enterobacter species with clinical significance. Third, we curated 1,997 Enterobacter genome sequences deposited in GenBank and found that the species identification of most Enterobacter strains needed to be corrected. Fourth, we found that the most common Enterobacter species seen in clinical samples is Enterobacter xiangfangensis rather than Enterobacter cloacae. Fifth, we identified 14 tentative novel Enterobacter and 18 tentative novel non-Enterobacter species. This study highlights that updated and curated taxonomic assignments are a prerequisite for correct species identification. We recommend that future Enterobacter studies use the updated taxonomy to avoid misleading information.
Project description: Understanding gut microbiome functions requires cultivated bacteria for experimental validation and reference bacterial genome sequences to interpret metagenome datasets and guide functional analyses. We present the Human Gastrointestinal Bacteria Culture Collection (HBC), a comprehensive set of 737 whole-genome-sequenced bacterial isolates, representing 273 species (105 novel species) from 31 families found in the human gastrointestinal microbiota. The HBC increases the number of bacterial genomes derived from human gastrointestinal microbiota by 37%. The resulting global Human Gastrointestinal Bacteria Genome Collection (HGG) classifies 83% of genera by abundance across 13,490 shotgun-sequenced metagenomic samples, improves taxonomic classification by 61% compared to the Human Microbiome Project (HMP) genome collection and achieves subspecies-level classification for almost 50% of sequences. The improved resource of gastrointestinal bacterial reference sequences circumvents dependence on de novo assembly of metagenomes and enables accurate and cost-effective shotgun metagenomic analyses of human gastrointestinal microbiota.