Phylogenetic incongruence arising from fragmented speciation in enteric bacteria.
ABSTRACT: Evolutionary relationships among species are often assumed to be fundamentally unambiguous, where genes within a genome are thought to evolve in concert and phylogenetic incongruence between individual orthologs is attributed to idiosyncrasies in their evolution. We have identified substantial incongruence between the phylogenies of orthologous genes in Escherichia, Salmonella, and Citrobacter, or E. coli, E. fergusonii, and E. albertii. The source of incongruence was inferred to be recombination, because individual genes support conflicting topology more robustly than expected from stochastic sequence homoplasies. Clustering of phylogenetically informative sites on the genome indicated that the regions of recombination extended over several kilobases. Analysis of phylogenetically distant taxa resulted in consensus among individual gene phylogenies, suggesting that recombination is not ongoing; instead, conflicting relationships among genes in descendent taxa reflect recombination among their ancestors. Incongruence could have resulted from random assortment of ancestral polymorphisms if species were instantly created from the division of a recombining population. However, the estimated branch lengths in alternative phylogenies would require ancestral populations with far more diversity than is found in extant populations. Rather, these and previous data collectively suggest that genome-wide recombination rates decreased gradually, with variation in rate among loci, leading to pluralistic relationships among their descendent taxa.
Project description:The reconstruction of the Tree of Life has relied almost entirely on concatenation methods, which do not accommodate gene tree heterogeneity, a property that simulations and theory have identified as a likely cause of incongruent phylogenies. However, this incongruence has not yet been demonstrated in empirical studies. Several key relationships among eutherian mammals remain controversial and conflicting among previous studies, including the root of the eutherian tree and the relationships within Euarchontoglires and Laurasiatheria. Both Bayesian and maximum-likelihood analyses of genome-wide data of 447 nuclear genes from 37 species show that concatenation methods indeed yield strong incongruence in the phylogeny of eutherian mammals, as revealed by subsampling analyses of loci and taxa, which produced strongly conflicting topologies. In contrast, the coalescent methods, which accommodate gene tree heterogeneity, yield a phylogeny that is robust to variable gene and taxon sampling and is congruent with geographic data. The data also demonstrate that incomplete lineage sorting, a major source of gene tree heterogeneity, is relevant to deep-level phylogenies, such as those among eutherian mammals. Our results firmly place the eutherian root between Atlantogenata and Boreoeutheria and support ungulate polyphyly and a sister-group relationship between Scandentia and Primates. This study demonstrates that the incongruence introduced by concatenation methods is a major cause of long-standing uncertainty in the phylogeny of eutherian mammals, and the same may apply to other clades. Our analyses suggest that such incongruence can be resolved using phylogenomic data and coalescent methods that deal explicitly with gene tree heterogeneity.
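As an illustration of why incomplete lineage sorting produces gene tree heterogeneity, the following is a minimal sketch and not part of the study's analysis. It assumes the standard three-taxon coalescent result that a gene tree matches a rooted three-taxon species tree with probability 1 - (2/3)e^(-T), where T is the internal branch length in coalescent units. The function names are hypothetical.

```python
import math


def concordance_probability(t):
    """Probability that a gene tree matches a rooted three-taxon species tree.

    t is the internal branch length in coalescent units
    (assumption: generations divided by 2 * effective population size).
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def discordance_probability_each(t):
    """Probability of each of the two discordant gene tree topologies."""
    if t < 0:
        raise ValueError("branch length must be non-negative")
    return (1.0 / 3.0) * math.exp(-t)


if __name__ == "__main__":
    # Illustrative branch lengths only; these are not values from the study.
    for t in (0.0, 0.5, 2.0):
        print(t, concordance_probability(t), discordance_probability_each(t))
```

Under this assumption, a very short internal branch leaves the three topologies nearly equally frequent among genes, which is the situation in which concatenation and coalescent methods are most likely to disagree.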
Project description:The papillomaviruses (PVs) are a family of viruses infecting several mammalian and nonmammalian species that cause cervical cancer in humans. The evolutionary history of the PVs as it is associated with a wide range of host species is not well understood. Incongruities between the phylogenetic trees of various viral genes as well as between these genes and the host phylogenies suggest historical viral recombination as well as violations of strict virus-host cospeciation. The extent of recombination events among PVs is uncertain, however, and there is little evidence to support a theory of PV spread via recent host transfers. We have investigated incongruence between PV genes and hence, the possibility of recombination, using Bayesian phylogenetic methods. We find significant evidence for phylogenetic incongruence among the six PV genes E1, E2, E6, E7, L1, and L2, indicating substantial recombination. Analysis of E1 and L1 phylogenies suggests ancestral recombination events. We also describe a new method for examining alternative host-parasite association mechanisms by applying importance sampling to Bayesian divergence time estimation. This new approach is not restricted by a fixed viral tree topology or knowledge of viral divergence times, multiple parasite taxa per host may be included, and it can distinguish between prior divergence of the virus before host speciation and host transfer of the virus following speciation. Using this method, we find prior divergence of PV lineages associated with the ancestral mammalian host resulting in at least 6 PV lineages prior to speciation of this host. These PV lineages have then followed paths of prior divergence and cospeciation to eventually become associated with the extant host species.
Only one significant instance of host transfer is supported, the transfer of the ancestral L1 gene between a Primate and Hystricognathi host based on the divergence times between the upsilon human type 41 and porcupine PVs.
Project description:With the increased availability of genome sequences for bacteria, it has become routine practice to construct genome-based phylogenies. These phylogenies have formed the basis for various taxonomic decisions, especially for resolving problematic relationships between taxa. Despite the popularity of concatenating shared genes to obtain well-supported phylogenies, various issues regarding this combined-evidence approach have been raised. These include the introduction of phylogenetic error into datasets, as well as incongruence due to organism-level evolutionary processes, particularly horizontal gene transfer and incomplete lineage sorting. Because of the huge effect that this could have on phylogenies, we evaluated the impact of phylogenetic conflict caused by organism-level evolutionary processes on the established species phylogeny for Pantoea, a member of the Enterobacterales. We explored the presence and distribution of phylogenetic conflict at the gene partition and nucleotide levels, by identifying putative inter-lineage recombination events that might have contributed to such conflict. Furthermore, we determined whether smaller, randomly constructed datasets had sufficient signal to reconstruct the current species tree hypothesis or if they would be overshadowed by phylogenetic incongruence. We found that no individual gene tree was fully congruent with the species phylogeny of Pantoea, although many of the expected nodes were supported by various individual genes across the genome. Evidence of recombination was found across all lineages within Pantoea, and provides support for organism-level evolutionary processes as a potential source of phylogenetic conflict. The phylogenetic signal from at least 70 random genes recovered robust, well-supported phylogenies for the backbone and most species relationships of Pantoea, and was unaffected by phylogenetic conflict within the dataset. 
Furthermore, despite providing limited resolution among taxa at the level of single gene trees, concatenated analyses of genes that were identified as having no signal resulted in a phylogeny that resembled the species phylogeny of Pantoea. This distribution of signal and noise across the genome presents the ideal situation for phylogenetic inference, as the topology from a ≥70-gene concatenated species phylogeny is not driven by single genes, and our data suggest that this finding may also hold true for smaller datasets. We thus argue that, by using a concatenation-based approach in phylogenomics, one can obtain robust phylogenies due to the synergistic effect of the combined signal obtained from multiple genes.
Project description:BACKGROUND:Introgressive events (e.g., hybridization, gene flow, horizontal gene transfer) and incomplete lineage sorting of ancestral polymorphisms are a challenge for phylogenetic analyses since different genes may exhibit conflicting genealogical histories. Grasses of the Triticeae tribe provide a particularly striking example of incongruence among gene trees. Previous phylogenies, mostly inferred with one gene, are in conflict for several taxon positions. Therefore, obtaining a resolved picture of relationships among genera and species of this tribe has been a challenging task. Here, we obtain the most comprehensive molecular dataset to date in Triticeae, including one chloroplastic and 26 nuclear genes. We aim to test whether it is possible to infer phylogenetic relationships in the face of (potentially) large-scale introgressive events and/or incomplete lineage sorting; to identify parts of the evolutionary history that have not evolved in a tree-like manner; and to decipher the biological causes of gene-tree conflicts in this tribe. RESULTS:We obtain resolved phylogenetic hypotheses using the supermatrix and Bayesian Concordance Factors (BCF) approaches despite numerous incongruences among gene trees. These phylogenies suggest the existence of 4-5 major clades within Triticeae, with Psathyrostachys and Hordeum being the deepest genera. In addition, we construct a multigenic network that highlights parts of the Triticeae history that have not evolved in a tree-like manner. Dasypyrum, Heteranthelium and genera of clade V, grouping Secale, Taeniatherum, Triticum and Aegilops, have evolved in a reticulated manner. Their relationships are thus better represented by the multigenic network than by the supermatrix or BCF trees. Notably, we demonstrate that gene-tree incongruences increase with genetic distance and are greater in telomeric than centromeric genes.
Together, our results suggest that recombination is the main factor decoupling gene trees from multigenic trees. CONCLUSIONS:Our study is the first to propose a comprehensive, multigenic phylogeny of Triticeae. It clarifies several aspects of the relationships among genera and species of this tribe, and pinpoints biological groups with likely reticulate evolution. Importantly, this study extends previous results obtained in Drosophila by demonstrating that recombination can exacerbate gene-tree conflicts in phylogenetic reconstructions.
Project description:The ciliate genus Spirostomum comprises eight morphospecies, inhabiting diverse aquatic environments worldwide, where they can be used as water quality indicators. Although Spirostomum species are relatively easily identified using morphological methods, the previous nuclear rDNA-based phylogenies indicated several conflicts in morphospecies delineation. Moreover, the single locus phylogenies and previous analytical approaches could not unambiguously resolve phylogenetic relationships among Spirostomum morphospecies. Here, we attempt to investigate species boundaries and evolutionary history of Spirostomum taxa, using 166 new sequences from multiple populations employing one mitochondrial locus (CO1 gene) and two nuclear loci (rRNA operon and alpha-tubulin gene). In accordance with previous studies, relationships among the eight Spirostomum morphospecies were poorly supported statistically in individual gene trees. To overcome this problem, we utilised for the first time in ciliates the Bayesian coalescent approach, which accounts for ancestral polymorphisms, incomplete lineage sorting, and recombination. This strategy enabled us to robustly resolve deep relationships between Spirostomum species and to support the hypothesis that taxa with compact macronucleus and taxa with moniliform macronucleus each form a distinct lineage. Bayesian coalescent-based delimitation analyses strongly statistically supported the traditional morphospecies concept but also indicated that there are two S. minus-like cryptic species and S. teres is non-monophyletic. Spirostomum teres was very likely defined by a set of ancestral features of lineages that also gave rise to S. yagiui and S. dharwarensis. However, molecular data from type populations of the morphospecies S. minus and S. teres are required to unambiguously resolve the taxonomic problems.
Project description:BACKGROUND:Increasing evidence from DNA sequence data has revealed that phylogenies based on different genes may drastically differ from each other. This may be due to either inter- or intralineage processes, or to methodological or stochastic errors. Here we investigate a spectacular case where two parts of the same gene (SlX1/Y1) show conflicting phylogenies within Silene (Caryophyllaceae). SlX1 and SlY1 are sex-linked genes on the sex chromosomes of dioecious members of Silene sect. Elisanthe. RESULTS:We sequenced the homologues of the SlX1/Y1 genes in several Sileneae species. We demonstrate that different parts of the SlX1/Y1 region give different phylogenetic signals. The major discrepancy is that Silene vulgaris and S. sect. Conoimorpha (S. conica and relatives) exchange positions. To determine whether gene duplication followed by recombination (an intralineage process) may explain the phylogenetic conflict in the Silene SlX1/Y1 gene, we use a novel probabilistic, multiple primer-pair PCR approach. We did not find any evidence supporting gene duplication/loss as an explanation for the phylogenetic conflict. CONCLUSION:The phylogenetic conflict in the Silene SlX1/Y1 gene cannot be explained by paralogy or artefacts, such as in vitro recombination during PCR. The support for the conflict is strong enough to exclude methodological or stochastic errors as likely sources. Instead, the phylogenetic incongruence may have been caused by recombination of two divergent alleles following ancient interspecific hybridization or incomplete lineage sorting. These events probably took place several million years ago. This example clearly demonstrates that different parts of the genome may have different evolutionary histories and stresses the importance of using multiple genes in reconstruction of taxonomic relationships.
Project description:The megadiverse subfamily Staphylininae traditionally belonged to the best-defined rove beetle taxa, but the advent of molecular phylogenetics in the last decade has brought turbulent changes to the group's classification. Here, we reevaluate the internal relationships among the tribes of Staphylininae by implementing tree inference methods that suppress common sources of systematic error. In congruence with morphological data, and in contrast to some previous phylogenetic studies, we unambiguously recover Staphylininae and Paederinae as monophyletic in the traditional sense. We show that the recently proposed subfamily Platyprosopinae (Arrowinus and Platyprosopus) is a phylogenetic artefact and reinstate Arrowinus as a member of Arrowinini stat. res. and Platyprosopus as a member of Platyprosopini stat. res. We show that several recent changes to the internal classification of the subfamily are phylogenetically unjustified and systematically unnecessary. We, therefore, reestablish Platyprosopini, Staphylinini, and Xantholinini as tribes within Staphylininae (all stat. res.) and recognize Coomaniini as a tribe (stat. nov.) rather than subfamily. Consequently, the traditional ranks of the subtribes Acylophorina, Afroquediina, Amblyopinina, Antimerina, †Baltognathina, Cyrtoquediina, Erichsoniina, Hyptiomina, Indoquediina, Quediina, and Tanygnathinina are restored (all stat. res.). We review the current classification of Staphylininae and discuss sources of incongruence in multigene phylogenies.
Project description:Phylogenetic studies aim to discover evolutionary relationships and histories. These studies are based on similarities of morphological characters and molecular sequences. Currently, widely accepted phylogenetic approaches are based on multiple sequence alignments, which analyze shared gene datasets and concatenate/coalesce these results to a final phylogeny with maximum support. However, these approaches still have limitations, and often have conflicting results with each other. Reconstructing ancestral genomes helps us understand mechanisms and corresponding consequences of evolution. Most existing genome level phylogeny and ancestor reconstruction methods can only process simplified real genome datasets or simulated datasets with identical genome content, unique genome markers, and limited types of evolutionary events. Here, we provide an alternative way to resolve phylogenetic problems based on analyses of real genome data. We use phylogenetic signals from all types of genome level evolutionary events, and overcome the conflicting issues existing in traditional phylogenetic approaches. Further, we build an automated computational pipeline to reconstruct phylogenies and ancestral genomes for two high-resolution real yeast genome datasets. Comparison results with recent studies and publications show that we reconstruct very accurate and robust phylogenies and ancestors. Finally, we identify and analyze the conserved syntenic blocks among reconstructed ancestral genomes and present yeast species.
Project description:When estimating a phylogeny from a multiple sequence alignment, researchers often assume the absence of recombination. However, if recombination is present, then tree estimation and all downstream analyses will be impacted, because different segments of the sequence alignment support different phylogenies. Similarly, convergent selective pressures at the molecular level can also lead to phylogenetic tree incongruence across the sequence alignment. Current methods for detection of phylogenetic incongruence are not equipped to distinguish between these two different mechanisms and assume that the incongruence is a result of recombination or other horizontal transfer of genetic information. We propose a new recombination detection method that can make this distinction, based on synonymous codon substitution distances. Although some power is lost by discarding the information contained in the nonsynonymous substitutions, our new method has lower false positive probabilities than the comparable recombination detection method when the phylogenetic incongruence signal is due to convergent evolution. We apply our method to three empirical examples, where we analyze: (1) sequences from a transmission network of the human immunodeficiency virus, (2) tlpB gene sequences from a geographically diverse set of 38 Helicobacter pylori strains, and (3) hepatitis C virus sequences sampled longitudinally from one patient.
Project description:BACKGROUND:The completion of rice genome sequencing has made rice and its wild relatives an attractive system for biological studies. Despite great efforts, phylogenetic relationships among genome types and species in the rice genus have not been fully resolved. To take full advantage of rice genome resources for biological research and rice breeding, we will benefit from the availability of a robust phylogeny of the rice genus. RESULTS:Through screening rice genome sequences, we sampled and sequenced 142 single-copy genes to clarify the relationships among all diploid genome types of the rice genus. The analysis identified two short internal branches around which most previous phylogenetic inconsistency emerged. These represent two episodes of rapid speciation that occurred approximately 5 and 10 million years ago (Mya) and gave rise to almost the entire diversity of the genus. The known chromosomal distribution of the sampled genes allowed the documentation of whole-genome sorting of ancestral alleles during the rapid speciation, which was responsible primarily for extensive incongruence between gene phylogenies and persisting phylogenetic ambiguity in the genus. Random sample analysis showed that 120 genes with an average length of 874 bp were needed to resolve both short branches with 95% confidence. CONCLUSION:Our phylogenomic analysis successfully resolved the phylogeny of rice genome types, which lays a solid foundation for comparative and functional genomic studies of rice and its relatives. This study also highlights that organismal genomes might be mosaics of conflicting genealogies because of rapid speciation and demonstrates the power of phylogenomics in the reconstruction of rapid diversification.
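The random sample analysis described above can be illustrated with a minimal sketch. It is not the authors' procedure: a plurality vote among per-gene topology labels stands in for a full phylogenetic analysis of each random gene sample, the gene labels are simulated, and all function names and parameter values are hypothetical.

```python
import random
from collections import Counter


def simulate_gene_topologies(n_genes, p_true, rng):
    """Simulate one topology label per gene (assumption: 0 is the true
    topology, 1 and 2 are equally likely alternatives)."""
    p_alt = (1.0 - p_true) / 2.0
    return rng.choices([0, 1, 2], weights=[p_true, p_alt, p_alt], k=n_genes)


def recovery_rate(topologies, sample_size, n_replicates, rng):
    """Fraction of random gene samples (drawn without replacement) in which
    the true topology (label 0) wins a strict plurality."""
    hits = 0
    for _ in range(n_replicates):
        counts = Counter(rng.sample(topologies, sample_size)).most_common()
        best_label, best_count = counts[0]
        tied = len(counts) > 1 and counts[1][1] == best_count
        if best_label == 0 and not tied:
            hits += 1
    return hits / n_replicates


def min_genes_for_confidence(topologies, target, n_replicates, rng):
    """Smallest sample size whose recovery rate reaches the target,
    or None if no sample size does."""
    for size in range(1, len(topologies) + 1):
        if recovery_rate(topologies, size, n_replicates, rng) >= target:
            return size
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    # Illustrative values only: 142 genes as sampled in the study, with a
    # hypothetical 45% of genes supporting the true topology.
    genes = simulate_gene_topologies(142, 0.45, rng)
    print(min_genes_for_confidence(genes, 0.95, 500, rng))
```

The sketch shows the shape of the calculation only: when a branch is short and gene trees are split among topologies, many genes are needed before random samples recover the same answer with high confidence.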