Dataset Information

A critical assessment of gene catalogs for metagenomic analysis.


ABSTRACT:

Motivation

Microbial gene catalogs are data structures that organize genes found in microbial communities, providing a reference for standardized analysis of the microbes across samples and studies. Although gene catalogs are commonly used, they have not been critically evaluated for their effectiveness as a basis for metagenomic analyses.

Results

As a case study, we investigate one such catalog, the Integrated Gene Catalog (IGC); however, our observations apply broadly to most gene catalogs constructed to date. We focus both on the approach used to construct this catalog and on its effectiveness when used as a reference for microbiome studies. Our results highlight important limitations of the approach used to construct the IGC and call into question the broad usefulness of gene catalogs more generally. We also recommend best practices for the construction and use of gene catalogs in microbiome studies and highlight opportunities for future research.

Availability

All supporting scripts for our analyses can be found on GitHub: https://github.com/SethCommichaux/IGC.git. The supporting data can be downloaded from: https://obj.umiacs.umd.edu/igc-analysis/IGC_analysis_data.tar.gz.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Commichaux S 

PROVIDER: S-EPMC8479683 | biostudies-literature | 2021 Apr

REPOSITORIES: biostudies-literature

Publications

A critical assessment of gene catalogs for metagenomic analysis.

Seth Commichaux, Nidhi Shah, Jay Ghurye, Alexander Stoppel, Jessica A Goodheart, Guillermo G Luque, Michael P Cummings, Mihai Pop

Bioinformatics (Oxford, England), 2021-09-01, issue 18


Similar Datasets

| S-EPMC6446249 | biostudies-literature
| S-EPMC7506068 | biostudies-literature
| PRJEB31567 | ENA
| S-EPMC10714869 | biostudies-literature
| S-EPMC526221 | biostudies-literature
| S-EPMC9645250 | biostudies-literature
| S-EPMC4223665 | biostudies-literature
| S-EPMC4715287 | biostudies-literature
| S-EPMC9784204 | biostudies-literature
2009-12-01 | GSE14276 | GEO