
Dataset Information


Improved Large-Scale Homology Search by Two-Step Seed Search Using Multiple Reduced Amino Acid Alphabets.


ABSTRACT: Metagenomic analysis, a technique used to comprehensively analyze microorganisms present in the environment, requires performing high-precision homology searches on large amounts of sequencing data, the size of which has increased dramatically with the development of next-generation sequencing. NCBI BLAST is the most widely used software for performing homology searches, but its speed is insufficient for the throughput of current DNA sequencers. In this paper, we propose a new, high-performance homology search algorithm that employs a two-step seed search strategy using multiple reduced amino acid alphabets to identify highly similar subsequences. Additionally, we evaluated the validity of the proposed method against several existing tools. Our method was faster than any other existing program for ≤120,000 queries, while DIAMOND, an existing tool, was the fastest method for >120,000 queries.
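The abstract names the strategy without giving details, so the following is only a minimal sketch of the general idea of a two-step seed search over reduced amino acid alphabets. It assumes that step one finds candidate seeds under a coarse alphabet and step two keeps only those that also match under a finer alphabet. The residue groupings, the seed length `k`, and every function name here are hypothetical illustrations, not the groupings or parameters used in the paper.

```python
from collections import defaultdict

# Hypothetical reduced alphabets (illustrative groupings, NOT the paper's).
# Each string is one group; every residue maps to its group's first letter.
COARSE_GROUPS = ["AGPST", "CILMV", "DENQ", "HKR", "FWY"]
FINE_GROUPS = ["A", "G", "P", "ST", "C", "IV", "LM",
               "DE", "NQ", "H", "KR", "FY", "W"]


def make_reducer(groups):
    """Return a function that rewrites a sequence in the reduced alphabet."""
    table = {aa: group[0] for group in groups for aa in group}

    def reduce_seq(seq):
        # Raises KeyError on non-standard residues (e.g. 'X'); a real tool
        # would need a policy for those.
        return "".join(table[aa] for aa in seq)

    return reduce_seq


reduce_coarse = make_reducer(COARSE_GROUPS)
reduce_fine = make_reducer(FINE_GROUPS)


def build_index(db_seq, k):
    """Index every coarse-alphabet k-mer of the database sequence."""
    index = defaultdict(list)
    reduced = reduce_coarse(db_seq)
    for i in range(len(reduced) - k + 1):
        index[reduced[i:i + k]].append(i)
    return index


def two_step_seeds(query, db_seq, k=4):
    """Return (query_pos, db_pos) seeds that pass both alphabet checks."""
    index = build_index(db_seq, k)
    q_coarse, q_fine = reduce_coarse(query), reduce_fine(query)
    d_fine = reduce_fine(db_seq)
    hits = []
    for q in range(len(query) - k + 1):
        # Step 1: candidate positions via the coarse alphabet index.
        for d in index.get(q_coarse[q:q + k], []):
            # Step 2: keep the candidate only if the finer alphabet agrees.
            if q_fine[q:q + k] == d_fine[d:d + k]:
                hits.append((q, d))
    return hits
```

In this sketch the coarse alphabet keeps the index small and tolerant of substitutions, while the finer check discards weak candidates before any costlier alignment step would run. For example, `two_step_seeds("MKVL", "AALRIMGG")` keeps the seed pairing `MKVL` with `LRIM`, whereas `two_step_seeds("MKVL", "CHCC")` passes the coarse step but is rejected by the fine step.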

SUBMITTER: Takabatake K 

PROVIDER: S-EPMC8469100 | biostudies-literature | 2021 Sep

REPOSITORIES: biostudies-literature


Publications

Improved Large-Scale Homology Search by Two-Step Seed Search Using Multiple Reduced Amino Acid Alphabets.

Takabatake Kazuki, Izawa Kazuki, Akikawa Motohiro, Yanagisawa Keisuke, Ohue Masahito, Akiyama Yutaka

Genes, 2021-09-21, issue 9



Similar Datasets

S-EPMC2732308 | biostudies-literature
S-EPMC10872054 | biostudies-literature
S-EPMC9284397 | biostudies-literature
S-EPMC2034607 | biostudies-literature
S-EPMC11262835 | biostudies-literature
S-EPMC10680702 | biostudies-literature
S-EPMC6610992 | biostudies-literature
S-EPMC2936399 | biostudies-literature
S-EPMC3311100 | biostudies-literature
S-EPMC5866580 | biostudies-literature