Dataset Information

Hierarchically Labeled Database Indexing Allows Scalable Characterization of Microbiomes.


ABSTRACT: Increasingly available microbial reference data allow interpreting the composition and function of previously uncharacterized microbial communities in detail, via high-throughput sequencing analysis. However, efficient methods for read classification are required when the best database matches for short sequence reads are often shared among multiple reference sequences. Here, we take advantage of the fact that microbial sequences can be annotated relative to established tree structures, and we develop a highly scalable read classifier, PRROMenade, by enhancing the generalized Burrows-Wheeler transform with a labeling step to directly assign reads to the corresponding lowest taxonomic unit in an annotation tree. PRROMenade solves the multi-matching problem while allowing fast variable-size sequence classification for phylogenetic or functional annotation. Our simulations with 5% added differences from reference indicated only 1.5% error rate for PRROMenade functional classification. On metatranscriptomic data PRROMenade highlighted biologically relevant functional pathways related to diet-induced changes in the human gut microbiome.
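The multi-matching problem mentioned in the abstract can be illustrated with a small sketch: when a read matches several reference sequences equally well, one common way to resolve the ambiguity is to assign the read to the lowest common ancestor (LCA) of those references in the annotation tree. The code below is only an illustration of that idea on a toy tree. The `PARENT` map, the taxon names, and both function names are hypothetical, and the sketch does not use a Burrows-Wheeler index or reproduce PRROMenade's actual labeling step, which this record does not describe in detail.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical toy annotation tree, stored as child -> parent.
# The root has parent None. These names are for illustration only.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Bacteria": "root",
    "Firmicutes": "Bacteria",
    "Bacteroidetes": "Bacteria",
    "Lactobacillus": "Firmicutes",
    "Clostridium": "Firmicutes",
}


def path_from_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the list of nodes from the root down to `node`."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = parent[current]
    path.reverse()
    return path


def lowest_common_ancestor(
    nodes: Iterable[str], parent: Dict[str, Optional[str]]
) -> str:
    """Return the deepest node that is an ancestor of (or equal to) every input node."""
    node_list = list(nodes)
    if not node_list:
        raise ValueError("at least one node is required")

    # Start with the full root-to-node path of the first match,
    # then shrink it to the prefix shared with every other match.
    common = path_from_root(node_list[0], parent)
    for node in node_list[1:]:
        other = path_from_root(node, parent)
        shared = 0
        while (
            shared < min(len(common), len(other))
            and common[shared] == other[shared]
        ):
            shared += 1
        common = common[:shared]

    if not common:
        raise ValueError("nodes do not share a root")
    return common[-1]


# A read whose best matches fall in two genera of the same phylum
# is assigned to that phylum rather than to either genus.
label = lowest_common_ancestor(["Lactobacillus", "Clostridium"], PARENT)
print(label)
```

In this toy setting, a read matching only one reference keeps that reference's own label, while a read matching references from different branches is pushed up the tree to their shared ancestor.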

SUBMITTER: Utro F 

PROVIDER: S-EPMC7125348 | biostudies-literature | 2020 Apr

REPOSITORIES: biostudies-literature

Publications

Hierarchically Labeled Database Indexing Allows Scalable Characterization of Microbiomes.

Utro F, Haiminen N, Siragusa E, Gardiner LJ, Seabolt E, Krishna R, Kaufman JH, Parida L

iScience, 2020-03-17, issue 4


Similar Datasets

| S-EPMC7924554 | biostudies-literature
| S-EPMC2696921 | biostudies-literature
| S-EPMC8330939 | biostudies-literature
| S-EPMC5732329 | biostudies-literature
2019-06-20 | GSE133071 | GEO
2019-05-02 | GSE130537 | GEO
2019-05-02 | GSE123578 | GEO
2019-05-02 | GSE123577 | GEO
2019-05-02 | GSE123580 | GEO