Dataset Information

Using strain-resolved analysis to identify contamination in metagenomics data.


ABSTRACT:

Background

Metagenomics analyses can be negatively impacted by DNA contamination. While external sources of contamination such as DNA extraction kits have been widely reported and investigated, contamination originating within the study itself remains underreported.

Results

Here, we applied high-resolution strain-resolved analyses to identify contamination in two large-scale clinical metagenomics datasets. By mapping strain sharing to DNA extraction plates, we identified well-to-well contamination in both negative controls and biological samples in one dataset. Such contamination is more likely to occur among samples in the same or adjacent columns or rows of the extraction plate than among samples that are far apart. Our strain-resolved workflow also reveals the presence of externally derived contamination, primarily in the other dataset. Overall, in both datasets, contamination is more significant in samples with lower biomass.
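The idea of mapping strain sharing onto an extraction plate can be illustrated with a small sketch. This is not the authors' workflow: the well labels, the sample names, the `shared_pairs` input, and the helper functions `parse_well`, `is_near`, and `sharing_rate_by_proximity` are all hypothetical, and the sketch assumes that pairs of samples sharing a strain have already been called by some upstream strain-comparison step. It only compares how often strain sharing occurs between wells in the same or adjacent rows or columns versus wells farther apart.

```python
from itertools import combinations


def parse_well(well):
    """Convert a well label such as 'B7' to zero-based (row, column) indices."""
    row = ord(well[0].upper()) - ord("A")
    col = int(well[1:]) - 1
    return row, col


def is_near(well_a, well_b):
    """True if two wells are in the same or adjacent rows, or the same or adjacent columns."""
    (r1, c1), (r2, c2) = parse_well(well_a), parse_well(well_b)
    return abs(r1 - r2) <= 1 or abs(c1 - c2) <= 1


def sharing_rate_by_proximity(plate, shared_pairs):
    """Fraction of sample pairs that share a strain, split by plate proximity.

    plate: dict mapping sample name -> well label (hypothetical layout).
    shared_pairs: set of frozensets, each holding two sample names that were
        called as sharing a strain by an upstream analysis (assumed input).
    """
    counts = {"near": [0, 0], "far": [0, 0]}  # [pairs sharing a strain, all pairs]
    for a, b in combinations(sorted(plate), 2):
        key = "near" if is_near(plate[a], plate[b]) else "far"
        counts[key][1] += 1
        if frozenset((a, b)) in shared_pairs:
            counts[key][0] += 1
    return {k: (s / n if n else float("nan")) for k, (s, n) in counts.items()}


# Made-up example: a negative control in well B2 shares strains with two neighbors.
plate = {"s1": "A1", "s2": "A2", "s3": "B1", "s4": "H12", "neg": "B2"}
shared_pairs = {frozenset(("s1", "neg")), frozenset(("s2", "neg"))}
rates = sharing_rate_by_proximity(plate, shared_pairs)
print(rates)
```

A higher sharing rate among nearby wells than among distant wells would be consistent with well-to-well contamination, although a real analysis would also need to account for biologically expected sharing (for example, samples from the same subject) and test the difference statistically.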

Conclusion

Our work demonstrates that genome-resolved strain tracking, with its essentially genome-wide nucleotide-level resolution, can be used to detect contamination in sequencing-based microbiome studies. Our results underscore the value of strain-specific methods to detect contamination and the critical importance of looking for contamination beyond negative and positive controls.

SUBMITTER: Lou YC 

PROVIDER: S-EPMC9979413 | biostudies-literature | 2023 Mar

REPOSITORIES: biostudies-literature

Publications

Using strain-resolved analysis to identify contamination in metagenomics data.

Yue Clare Lou, Jordan Hoff, Matthew R. Olm, Jacob West-Roberts, Spencer Diamond, Brian A. Firek, Michael J. Morowitz, Jillian F. Banfield

Microbiome, 2023-03-02, issue 1

