
Dataset Information


Meta-Prism 2.0: Enabling algorithm and web server for ultra-fast, memory-efficient, and accurate analysis among millions of microbial community samples.


ABSTRACT: Microbial community samples are accumulating faster than ever, with hundreds of thousands of samples being sequenced each year. Mining such a large volume of multisource, heterogeneous data is increasingly difficult, so efficient and accurate comparison and search of samples are urgently needed: faced with millions of samples in data repositories, traditional sample comparison and search approaches fall short in both speed and accuracy. Here we propose Meta-Prism 2.0, a microbial community sample analysis method that improves time and memory efficiency without compromising accuracy. Built on a sparse data structure, a time-saving instruction pipeline, and SIMD optimization, Meta-Prism 2.0 enables ultra-fast, memory-efficient, flexible, and accurate search among millions of samples. Meta-Prism 2.0 was tested on several data sets, the largest containing 1 million samples. The results show that, with a comparison speed of 0.00001 s per sample pair and a memory requirement of 8 GB for searching against 1 million samples, Meta-Prism 2.0 is one of the most efficient sample analysis methods. Additionally, Meta-Prism 2.0 achieves accuracy comparable with or better than that of other contemporary methods. Furthermore, Meta-Prism 2.0 can precisely identify the biome of origin for samples, thus enabling sample source tracking. Finally, we provide a web server for fast online search of microbial community samples. In summary, Meta-Prism 2.0 turns resource-intensive sample search into an effective procedure that researchers can run routinely, even on a laptop, for insightful sample search, similarity analysis, and knowledge discovery. Meta-Prism 2.0 can be accessed at https://github.com/HUST-NingKang-Lab/Meta-Prism-2.0, and the web server can be accessed at https://hust-ningkang-lab.github.io/Meta-Prism-2.0/.
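
To put the reported figures in perspective, the following back-of-the-envelope sketch turns the abstract's two numbers (0.00001 s per sample pair, 8 GB for 1 million samples) into a per-query search time and a per-sample memory footprint. It is only illustrative arithmetic, not part of Meta-Prism 2.0: the function names are hypothetical, and it assumes that search time scales linearly with the number of comparisons, that memory scales linearly with the number of stored samples, and that "8 GB" means decimal gigabytes.

```python
# Back-of-the-envelope figures derived from the numbers quoted in the abstract.
# Assumptions (not from the source): linear scaling, no I/O or start-up
# overhead, and 1 GB = 10**9 bytes.

SECONDS_PER_PAIR = 1e-5        # reported comparison time per sample pair
DATABASE_SAMPLES = 1_000_000   # size of the largest data set mentioned
MEMORY_BYTES = 8 * 10**9       # reported 8 GB memory requirement


def search_seconds(n_queries: int, n_database: int = DATABASE_SAMPLES) -> float:
    """Estimated time to compare every query against every database sample."""
    return n_queries * n_database * SECONDS_PER_PAIR


def bytes_per_sample(n_database: int = DATABASE_SAMPLES) -> float:
    """Average memory per stored sample if the 8 GB is spread evenly."""
    return MEMORY_BYTES / n_database


if __name__ == "__main__":
    print(f"1 query vs 1M samples:    ~{search_seconds(1):.0f} s")
    print(f"100 queries vs 1M samples: ~{search_seconds(100) / 60:.1f} min")
    print(f"memory per stored sample:  ~{bytes_per_sample() / 1000:.0f} kB")
```

Under these assumptions, one query against 1 million samples takes on the order of ten seconds and each stored sample averages roughly 8 kB, which is consistent with the abstract's claim that such searches are feasible on a laptop.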

SUBMITTER: Kang K 

PROVIDER: S-EPMC9334027 | biostudies-literature | 2022 Jul

REPOSITORIES: biostudies-literature


Publications

Meta-Prism 2.0: Enabling algorithm and web server for ultra-fast, memory-efficient, and accurate analysis among millions of microbial community samples.

Kang Kai, Chong Hui, Ning Kang

GigaScience, 2022-07-01



Similar Datasets

| S-EPMC8355595 | biostudies-literature
| S-EPMC2744726 | biostudies-literature
| S-EPMC7660903 | biostudies-literature
| S-EPMC9252832 | biostudies-literature
| S-EPMC4410662 | biostudies-literature
| S-EPMC10190105 | biostudies-literature
| S-EPMC3692047 | biostudies-literature
| S-EPMC4987896 | biostudies-literature
| S-EPMC3380062 | biostudies-literature
| S-EPMC1933189 | biostudies-literature