
Dataset Information


AMAISE: a machine learning approach to index-free sequence enrichment.


ABSTRACT: Metagenomics holds potential to improve clinical diagnostics of infectious diseases, but DNA from clinical specimens is often dominated by host-derived sequences. To address this, researchers employ host-depletion methods. Laboratory-based host-depletion methods, however, are costly in terms of time and effort, while computational host-depletion methods rely on memory-intensive reference index databases and struggle to accurately classify noisy sequence data. To solve these challenges, we propose an index-free tool, AMAISE (A Machine Learning Approach to Index-Free Sequence Enrichment). Applied to the task of separating host from microbial reads, AMAISE achieves over 98% accuracy. Applied prior to metagenomic classification, AMAISE results in a 14-18% decrease in memory usage compared to using metagenomic classification alone. Our results show that a reference-independent machine learning approach to host depletion allows for accurate and efficient sequence detection.

SUBMITTER: Krishnamoorthy M 

PROVIDER: S-EPMC9184628 | biostudies-literature | 2022 Jun

REPOSITORIES: biostudies-literature


Publications

AMAISE: a machine learning approach to index-free sequence enrichment.

Krishnamoorthy M, Ranjan P, Erb-Downward JR, Dickson RP, Wiens J

Communications Biology, 2022-06-09 (1)



Similar Datasets

2021-06-02 | GSE175942 | GEO
2021-06-01 | GSE171549 | GEO
| S-EPMC5905914 | biostudies-other
| S-EPMC5561031 | biostudies-other
| S-EPMC9893444 | biostudies-literature
| S-EPMC11367643 | biostudies-literature
| S-EPMC8268518 | biostudies-literature
2022-09-14 | E-MTAB-11607 | biostudies-arrayexpress
| S-EPMC7256406 | biostudies-literature
| S-EPMC7317629 | biostudies-literature