
Dataset Information

GATC: a genetic algorithm for gene tree construction under the Duplication-Transfer-Loss model of evolution.


ABSTRACT: BACKGROUND: Several methods have been developed for the accurate reconstruction of gene trees. Some of them use reconciliation with a species tree to correct, a posteriori, errors in gene trees inferred from multiple sequence alignments. Unfortunately, the best fit to sequence information can be lost during this process. RESULTS: We describe GATC, a new algorithm for reconstructing a binary gene tree with branch lengths. GATC returns optimal solutions according to a measure combining both tree likelihood (according to sequence evolution) and a reconciliation score under the Duplication-Transfer-Loss (DTL) model. It can be used either to construct a gene tree from scratch or to correct trees inferred by existing reconstruction methods, which makes it flexible with respect to input data types. The method is based on a genetic algorithm acting on a population of trees at each step. It substantially increases the efficiency of the phylogeny space exploration, reducing the risk of falling into local minima, at a reasonable computational cost. We have applied GATC to a dataset of simulated cyanobacterial phylogenies, as well as to an empirical dataset of three reference gene families, and showed that it is able to improve gene tree reconstructions compared with current state-of-the-art algorithms. CONCLUSION: The proposed algorithm is able to accurately reconstruct gene trees and is highly suitable for the construction of reference trees. Our results also highlight the efficiency of multi-objective optimization algorithms for the gene tree reconstruction problem. GATC is available on GitHub at: https://github.com/UdeM-LBIT/GATC .
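
The abstract describes scoring candidate trees on two criteria at once: sequence likelihood and DTL reconciliation score. As a minimal sketch of what a multi-objective comparison between candidates can look like, the Python below filters a population down to its non-dominated (Pareto-optimal) members. This is not GATC's code: the `Candidate` type, the function names, and all numeric values are hypothetical and only illustrate the general idea, assuming a higher log-likelihood and a lower reconciliation cost are both preferable.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical stand-in for one gene tree in the population."""
    name: str              # illustrative label, not a real tree
    log_likelihood: float  # fit to the sequence data; higher is better
    dtl_cost: float        # reconciliation cost under DTL; lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is at least as good as b on both objectives
    and strictly better on at least one."""
    no_worse = a.log_likelihood >= b.log_likelihood and a.dtl_cost <= b.dtl_cost
    strictly_better = a.log_likelihood > b.log_likelihood or a.dtl_cost < b.dtl_cost
    return no_worse and strictly_better


def pareto_front(population: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in population
        if not any(dominates(other, c) for other in population)
    ]


# Made-up scores, for illustration only.
population = [
    Candidate("t1", -1200.0, 14.0),
    Candidate("t2", -1180.0, 20.0),
    Candidate("t3", -1250.0, 9.0),
    Candidate("t4", -1210.0, 16.0),  # worse than t1 on both objectives
]

front = pareto_front(population)
print([c.name for c in front])
```

In this toy population, `t4` is dropped because `t1` beats it on both objectives, while `t1`, `t2`, and `t3` each represent a different trade-off and are retained. A genetic algorithm could use such a filter during selection; whether GATC does so in this form is not stated in the abstract.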

SUBMITTER: Noutahi E 

PROVIDER: S-EPMC5954287 | BioStudies | 2018-01-01

SECONDARY ACCESSION(S): 10.5061/dryad.pv6df

REPOSITORIES: biostudies

Similar Datasets

2008-01-01 | S-EPMC3205801 | BioStudies
1000-01-01 | S-EPMC3371857 | BioStudies
2015-01-01 | S-EPMC4380024 | BioStudies
1000-01-01 | S-EPMC3577452 | BioStudies
2018-01-01 | S-EPMC5985597 | BioStudies
2020-01-01 | S-EPMC7249433 | BioStudies
2008-01-01 | S-EPMC2553584 | BioStudies
1000-01-01 | S-EPMC2905365 | BioStudies
1000-01-01 | S-EPMC3105381 | BioStudies
1000-01-01 | S-EPMC5460407 | BioStudies