Dataset Information

DecentTree: scalable Neighbour-Joining for the genomic era.

ABSTRACT:

Motivation

Neighbour-Joining is one of the most widely used distance-based phylogenetic inference methods. However, current implementations do not scale well for datasets with more than 10 000 sequences. Given the increasing pace of generating new sequence data, particularly in outbreaks of emerging diseases, and the already enormous existing databases of sequence data for which Neighbour-Joining is a useful approach, new implementations of existing methods are warranted.

Results

Here, we present DecentTree, which provides highly optimized, parallel implementations of Neighbour-Joining and several of its variants. DecentTree is designed both as a stand-alone application and as a header-only library that is easily integrated with other phylogenetic software (e.g. it is an integral part of the popular IQ-TREE software). We show that DecentTree performs as well as or better than existing software (BIONJ, Quicktree, FastME, and RapidNJ), especially on very large alignments. For example, DecentTree is up to 6-fold faster than the fastest existing Neighbour-Joining software (e.g. RapidNJ) when generating a tree of 64 000 SARS-CoV-2 genomes.
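For readers unfamiliar with the method being optimized, the following is a minimal sketch of the textbook Neighbour-Joining algorithm in Python. It is not DecentTree's implementation: it is the plain cubic-time version with no optimization or parallelism, and the function name `neighbour_joining` and its input format (a list of labels plus a symmetric list-of-lists distance matrix) are assumptions made for illustration only.

```python
from itertools import combinations


def neighbour_joining(names, dist):
    """Build an unrooted tree with textbook Neighbour-Joining.

    names: taxon labels; dist: symmetric distance matrix (list of lists).
    Returns a Newick string with a trifurcation at the top level.
    (Hypothetical helper for illustration, not part of DecentTree.)
    """
    n_taxa = len(names)
    if n_taxa < 3:
        raise ValueError("need at least three taxa")

    # d[i][j] holds current distances; new internal nodes get fresh ids.
    d = {i: {j: float(dist[i][j]) for j in range(n_taxa)} for i in range(n_taxa)}
    subtree = {i: names[i] for i in range(n_taxa)}
    active = list(range(n_taxa))
    next_id = n_taxa

    while len(active) > 3:
        n = len(active)
        # Row sums over the nodes that are still active.
        r = {i: sum(d[i][k] for k in active) for i in active}

        def q(pair):
            # Join criterion: Q(i, j) = (n - 2) * d(i, j) - r_i - r_j
            i, j = pair
            return (n - 2) * d[i][j] - r[i] - r[j]

        i, j = min(combinations(active, 2), key=q)

        # Branch lengths from i and j to the new internal node u.
        li = 0.5 * d[i][j] + (r[i] - r[j]) / (2 * (n - 2))
        lj = d[i][j] - li

        u = next_id
        next_id += 1
        d[u] = {u: 0.0}
        rest = [k for k in active if k not in (i, j)]
        for k in rest:
            # Distance from the new node to every remaining node.
            d[u][k] = d[k][u] = 0.5 * (d[i][k] + d[j][k] - d[i][j])

        subtree[u] = f"({subtree[i]}:{li:.4f},{subtree[j]}:{lj:.4f})"
        active = rest + [u]

    # Resolve the final three nodes around a central trifurcation.
    a, b, c = active
    la = 0.5 * (d[a][b] + d[a][c] - d[b][c])
    lb = 0.5 * (d[a][b] + d[b][c] - d[a][c])
    lc = 0.5 * (d[a][c] + d[b][c] - d[a][b])
    return f"({subtree[a]}:{la:.4f},{subtree[b]}:{lb:.4f},{subtree[c]}:{lc:.4f});"
```

Calling `neighbour_joining(["a", "b", "c"], matrix)` with a symmetric matrix returns a Newick string. Each iteration scans all pairs of active nodes, which is the quadratic-per-join cost that becomes prohibitive for tens of thousands of sequences.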

Availability and implementation

DecentTree is open source and freely available at https://github.com/iqtree/decenttree. All code and data used in this analysis are available on GitHub (https://github.com/asdcid/Comparison-of-neighbour-joining-software).

SUBMITTER: Wang W

PROVIDER: S-EPMC10491953 | biostudies-literature | 2023 Sep

REPOSITORIES: biostudies-literature

Publications

DecentTree: scalable Neighbour-Joining for the genomic era.

Weiwen Wang, James Barbetti, Thomas Wong, Bryan Thornlow, Russ Corbett-Detig, Yatish Turakhia, Robert Lanfear, Bui Quang Minh

Bioinformatics (Oxford, England), 2023-09-01, issue 9

PMID: 37651445

Similar Datasets

Scalable neighbour search and alignment with uvaia.
| S-EPMC10924453 | biostudies-literature

Rapid inference of antibiotic resistance and susceptibility by genomic neighbour typing.
| S-EPMC7044115 | biostudies-literature

Clinical Trials in the Genomic Era.
| S-EPMC5652384 | biostudies-other

The era of genomic epidemiology.
| S-EPMC2826447 | biostudies-literature

Genomic reproducibility in the bioinformatics era.
| S-EPMC11312195 | biostudies-literature

Comparative proteomics of three Giardia lamblia strains: Antigenic variation in the post-genomic era
2020-05-20 | PXD017597 | Pride

Measurably evolving pathogens in the genomic era.
| S-EPMC4457702 | biostudies-literature

The Adapting Mind in the Genomic Era.
| S-EPMC4735379 | biostudies-literature

Neuroblastoma treatment in the post-genomic era.
| S-EPMC5299732 | biostudies-literature