A phylometagenomic exploration of oceanic alphaproteobacteria reveals mitochondrial relatives unrelated to the SAR11 clade.
ABSTRACT: BACKGROUND: According to the endosymbiont hypothesis, the mitochondrial system for aerobic respiration was derived from an ancestral Alphaproteobacterium. Phylogenetic studies indicate that the mitochondrial ancestor is most closely related to the Rickettsiales. Recently, it was suggested that Candidatus Pelagibacter ubique, a member of the SAR11 clade that is highly abundant in the oceans, is a sister taxon to the mitochondrial-Rickettsiales clade. The availability of ocean metagenome data substantially increases the sampling of Alphaproteobacteria inhabiting the oxygen-containing waters of the oceans that likely resemble the originating environment of mitochondria. METHODOLOGY/PRINCIPAL FINDINGS: We present a phylogenetic study of the origin of mitochondria that incorporates metagenome data from the Global Ocean Sampling (GOS) expedition. We identify mitochondrially related sequences in the GOS dataset that represent a rare group of Alphaproteobacteria, designated OMAC (Oceanic Mitochondria Affiliated Clade), as the closest free-living relatives to mitochondria in the oceans. In addition, our analyses reject the hypothesis that the mitochondrial system for aerobic respiration is affiliated with that of the SAR11 clade. CONCLUSIONS/SIGNIFICANCE: Our results point to the existence of an alphaproteobacterial clade in the oxygen-rich surface waters of the oceans that represents the closest free-living relative to mitochondria identified thus far. In addition, our findings underscore the importance of expanding the taxonomic diversity in phylogenetic analyses beyond that represented by cultivated bacteria to study the origin of mitochondria.
Project description:Mitochondria share a common ancestor with the Alphaproteobacteria, but determining their precise origins is challenging due to inherent difficulties in phylogenetically reconstructing ancient evolutionary events. Nonetheless, phylogenetic accuracy improves with more refined tools and expanded taxon sampling. We investigated mitochondrial origins with the benefit of new, deeply branching genome sequences from the ancient and prolific SAR11 clade of Alphaproteobacteria and publicly available alphaproteobacterial and mitochondrial genome sequences. Using the automated phylogenomic pipeline Hal, we systematically studied the effect of taxon sampling and missing data to accommodate small mitochondrial genomes. The evidence supports a common origin of mitochondria and SAR11 as a sister group to the Rickettsiales. The simplest explanation of these data is that mitochondria evolved from a planktonic marine alphaproteobacterial lineage that participated in multiple inter-specific cell colonization events, in some cases yielding parasitic relationships, but in at least one case producing a symbiosis that characterizes modern eukaryotic life.
Project description:BACKGROUND: The evolution of the Alphaproteobacteria and the origin of mitochondria are topics of considerable debate. Most studies have placed the mitochondrial ancestor within the Rickettsiales order. Ten years ago, the bacterium Odyssella thessalonicensis was isolated from Acanthamoeba spp., and the 16S rDNA phylogeny placed it within the Rickettsiales. Recently, the whole genome of O. thessalonicensis has been sequenced, and 16S rDNA phylogeny and more robust and accurate phylogenomic analyses have been performed with 65 highly conserved proteins. METHODOLOGY/PRINCIPAL FINDINGS: The results suggested that O. thessalonicensis emerged between the Rickettsiales and other Alphaproteobacteria. The mitochondrial proteins of Reclinomonas americana have been used to locate the phylogenetic position of the mitochondrial ancestor within the Alphaproteobacteria tree. Using the K tree score method, nine mitochondrion-encoded proteins, whose phylogenies were congruent with the Alphaproteobacteria phylogenomic tree, have been selected and concatenated for Bayesian and Maximum Likelihood phylogenies. The Reclinomonas americana mitochondrion is a sister taxon to the free-living bacterium Candidatus Pelagibacter ubique, and together, they form a clade that is deeply rooted in the Rickettsiales clade. CONCLUSIONS/SIGNIFICANCE: The Reclinomonas americana mitochondrion phylogenomic study confirmed that mitochondria emerged deeply in the Rickettsiales clade and that they are closely related to Candidatus Pelagibacter ubique.
Project description:Mitochondria originated endosymbiotically from an Alphaproteobacteria-like ancestor. However, it is still uncertain which extant group of Alphaproteobacteria is phylogenetically closest to the mitochondrial ancestor. The proposed groups comprise the order Rickettsiales, the family Rhodospirillaceae, and the genus Rickettsia. In this study, we apply a new complex network approach to investigate the evolutionary origins of mitochondria, analyzing protein sequence modules in a critical network obtained through a critical similarity threshold between the studied sequences. The dataset included three mitochondrial ATP synthase subunits (4, 6, and 9) and their alphaproteobacterial homologs (b, a, and c). For all subunits, the results gave no support to the hypothesis that Rickettsiales are closely related to the mitochondrial ancestor. Our findings support the hypothesis that mitochondria share a common ancestor with a clade containing all Alphaproteobacteria orders, except Rickettsiales.
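The threshold-network idea in the preceding abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say how the critical threshold is chosen or how modules are detected, so this sketch assumes a fixed, user-supplied threshold and approximates modules as connected components. The sequence labels and similarity scores are hypothetical placeholders, not data from the study.

```python
# Sketch: build a similarity network by thresholding pairwise scores, then
# report connected components as a stand-in for "modules" (an assumption).

def threshold_network(similarity, threshold):
    """Return an adjacency dict linking pairs whose score is >= threshold."""
    graph = {}
    for (a, b), score in similarity.items():
        # Register both sequences as nodes even if they end up isolated.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def connected_components(graph):
    """Return the components as sorted lists, found by depth-first search."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(sorted(component))
    return components

# Hypothetical pairwise similarities between four placeholder sequences.
toy_similarity = {
    ("seq_A", "seq_B"): 0.9,
    ("seq_B", "seq_C"): 0.8,
    ("seq_A", "seq_C"): 0.4,
    ("seq_C", "seq_D"): 0.3,
    ("seq_A", "seq_D"): 0.2,
    ("seq_B", "seq_D"): 0.1,
}

modules_loose = connected_components(threshold_network(toy_similarity, 0.5))
modules_strict = connected_components(threshold_network(toy_similarity, 0.85))
```

Raising the threshold splits the network into smaller groups, which is why the choice of threshold matters for which sequences end up in the same module.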
Project description:Although free-living, members of the successful SAR11 group of marine alpha-proteobacteria have very small, A+T-rich genomes, two features that are typical of mitochondria and related obligate intracellular parasites such as the Rickettsiales. Some previous phylogenetic analyses have suggested that Candidatus Pelagibacter ubique, the first cultured member of this group, is related to the Rickettsiales+mitochondria clade, whereas others disagree with this conclusion. In order to determine the evolutionary position of the SAR11 group and its relationship to the origin of mitochondria, we have performed phylogenetic analyses on the concatenation of 24 proteins from 5 mitochondria and 71 proteobacteria. Our results indicate that the SAR11 group is not the sister group of the Rickettsiales+mitochondria clade and confirm that the position of this group in the alpha-proteobacterial tree is strongly affected by tree reconstruction artefacts due to compositional bias. As a consequence, genome reduction and bias toward a high A+T content may have evolved independently in the SAR11 species, which points to a different direction in the quest for the closest relatives to mitochondria and Rickettsiales. In addition, our analyses raise doubts about the monophyly of the newly proposed Pelagibacteraceae family.
Project description:Bacteria in the class Alphaproteobacteria have a wide variety of lifestyles and physiologies. They include pathogens of humans and livestock, agriculturally valuable strains, and several highly abundant marine groups. The ancestor of mitochondria also originated in this clade. Despite significant effort to investigate the phylogeny of the Alphaproteobacteria with a variety of methods, there remains considerable disparity in the placement of several groups. Recent emphasis on phylogenies derived from multiple protein-coding genes remains contentious due to disagreement over appropriate gene selection and the potential influences of systematic error. We revisited previous investigations in this area using concatenated alignments of the small and large subunit (SSU and LSU) rRNA genes, as we show here that these loci have much lower GC bias than whole genomes. This approach has allowed us to update the canonical 16S rRNA gene tree of the Alphaproteobacteria with additional important taxa that were not previously included, and with added resolution provided by concatenating the SSU and LSU genes. We investigated the topological stability of the Alphaproteobacteria by varying alignment methods, rate models, taxon selection and RY-recoding to circumvent GC content bias. We also introduce RYMK-recoding and show that it avoids some of the information loss in RY-recoding. We demonstrate that the topology of the Alphaproteobacteria is sensitive to inclusion of several groups of taxa, but it is less affected by the choice of alignment and rate methods. The majority of topologies and comparative results from Approximately Unbiased tests provide support for positioning the Rickettsiales and the mitochondrial branch within a clade. This composite clade is a sister group to the abundant marine SAR11 clade (Pelagibacterales). Furthermore, we add support for taxonomic assignment of several recently sequenced taxa. 
Accordingly, we propose three subclasses within the Alphaproteobacteria: the Caulobacteridae, the Rickettsidae, and the Magnetococcidae.
Project description:BACKGROUND: The SAR11 group of Alphaproteobacteria is highly abundant in the oceans. It contains a recently diverged freshwater clade, which offers the opportunity to compare adaptations to salt- and freshwaters in a monophyletic bacterial group. However, there are no cultivated members of the freshwater SAR11 group and no genomes have been sequenced yet. RESULTS: We isolated ten single SAR11 cells from three freshwater lakes and sequenced and assembled their genomes. A phylogeny based on 57 proteins indicates that the cells are organized into distinct microclusters. We show that the freshwater genomes have evolved primarily by the accumulation of nucleotide substitutions and that they have among the lowest ratio of recombination to mutation estimated for bacteria. In contrast, members of the marine SAR11 clade have one of the highest ratios. Additional metagenome reads from six lakes confirm low recombination frequencies for the genome overall and reveal lake-specific variations in microcluster abundances. We identify hypervariable regions with gene contents broadly similar to those in the hypervariable regions of the marine isolates, containing genes putatively coding for cell surface molecules. CONCLUSIONS: We conclude that recombination rates differ dramatically in phylogenetic sister groups of the SAR11 clade adapted to freshwater and marine ecosystems. The results suggest that the transition from marine to freshwater systems has purged diversity and resulted in reduced opportunities for recombination with divergent members of the clade. The low recombination frequencies of the freshwater SAR11 (LD12) clade resemble the low genetic divergence of host-restricted pathogens that have recently shifted to a new host.
Project description:Planktonic bacterial lineages with streamlined genomes are prevalent in the ocean. The base composition of their DNA is often highly biased towards low G+C content, a possible source of systematic error in phylogenetic reconstruction. A total of 228 orthologous protein families were sampled that are shared among major lineages of Alphaproteobacteria, including the marine free-living SAR11 clade and the obligate endosymbiotic Rickettsiales. These two ecologically distinct lineages share genome sizes of <1.5 Mbp and genomic G+C content of <30%. Statistical analyses showed that only 28 protein families are composition-homogeneous, whereas the other 200 families significantly violate the composition-homogeneous assumption included in most phylogenetic methods. RAxML analysis based on the concatenation of 24 ribosomal proteins that fall into the heterogeneous protein category clustered the SAR11 and Rickettsiales lineages at the base of the Alphaproteobacteria tree, whereas that based on the concatenation of 28 homogeneous proteins (including 19 ribosomal proteins) disassociated the lineages and placed SAR11 at the base of the non-endosymbiotic lineages. When the two data sets were concatenated, only a model that accounted for compositional bias yielded a tree identical to the tree built with composition-homogeneous proteins. Ancestral genome analysis suggests that the first evolved SAR11 cell had a small genome streamlined from its ancestor by a factor of two and coinciding with an ecological transition, followed by further gradual streamlining towards the extant SAR11 populations.
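To make the notion of a composition-homogeneous protein family concrete, here is a minimal sketch of one common way to quantify compositional heterogeneity across taxa. The abstract does not name the statistical test that was used, so the chi-square statistic below is an assumption rather than the authors' procedure; the taxon labels and sequences are toy placeholders. The sketch returns only the statistic and its degrees of freedom and leaves the significance lookup to a statistics library.

```python
from collections import Counter

def composition_chi_square(sequences):
    """Chi-square statistic for homogeneity of residue composition.

    sequences: dict mapping taxon name -> aligned sequence.
    Gap and unknown characters ('-', '?', 'X') are ignored.
    Returns (chi2, degrees_of_freedom).
    """
    counts = {
        taxon: Counter(ch for ch in seq.upper() if ch not in "-?X")
        for taxon, seq in sequences.items()
    }
    states = sorted(set().union(*counts.values()))
    row_totals = {taxon: sum(c.values()) for taxon, c in counts.items()}
    col_totals = {s: sum(c[s] for c in counts.values()) for s in states}
    grand_total = sum(row_totals.values())

    chi2 = 0.0
    for taxon, c in counts.items():
        for s in states:
            # Expected count if every taxon shared the pooled composition.
            expected = row_totals[taxon] * col_totals[s] / grand_total
            if expected > 0:
                chi2 += (c[s] - expected) ** 2 / expected
    dof = (len(counts) - 1) * (len(states) - 1)
    return chi2, dof

# Toy placeholders: same composition in both taxa vs. fully divergent.
homogeneous = composition_chi_square({"taxon_1": "AAGG", "taxon_2": "GGAA"})
heterogeneous = composition_chi_square({"taxon_1": "AAAA", "taxon_2": "GGGG"})
```

A larger statistic for a given number of degrees of freedom indicates that the taxa differ more in composition than a single shared composition would predict, which is the situation in which standard homogeneous substitution models can mislead.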
Project description:Molecular phylogenetics and phylogenomics are subject to noise from horizontal gene transfer (HGT) and bias from convergence in macromolecular compositions. Extensive variation in size, structure and base composition of alphaproteobacterial genomes has complicated their phylogenomics, sparking controversy over the origins and closest relatives of the SAR11 strains. SAR11 are highly abundant, cosmopolitan aquatic Alphaproteobacteria with streamlined, A+T-biased genomes. A dominant view holds that SAR11 are monophyletic and related to both Rickettsiales and the ancestor of mitochondria. Other studies dispute this, finding evidence of a polyphyletic origin of SAR11 with most strains distantly related to Rickettsiales. Although careful evolutionary modeling can reduce bias and noise in phylogenomic inference, entirely different approaches may be useful to extract robust phylogenetic signals from genomes. Here we develop simple phyloclassifiers from bioinformatically derived tRNA Class-Informative Features (CIFs), features predicted to target tRNAs for specific interactions within the tRNA interaction network. Our tRNA CIF-based model robustly and accurately classifies alphaproteobacterial genomes into one of seven undisputed monophyletic orders or families, despite great variability in tRNA gene complement sizes and base compositions. Our model robustly rejects monophyly of SAR11, classifying all but one strain as Rhizobiales with strong statistical support. Yet remarkably, conventional phylogenetic analysis of tRNAs classifies all SAR11 strains identically as Rickettsiales. We attribute this discrepancy to convergence of SAR11 and Rickettsiales tRNA base compositions. Thus, tRNA CIFs appear more robust to compositional convergence than tRNA sequences generally. Our results suggest that tRNA-CIF-based phyloclassification is robust to HGT of components of the tRNA interaction network, such as aminoacyl-tRNA synthetases. 
We explain why tRNAs are especially advantageous for prediction of traits governing macromolecular interactions from genomic data, and why such traits may be advantageous in the search for robust signals to address difficult problems in classification and phylogeny.
Project description:Overwhelming evidence supports the endosymbiosis theory that mitochondria originated once from the Alphaproteobacteria. However, their exact position in the tree of life remains highly debated. This is because systematic errors, including biased taxonomic sampling, high evolutionary rates and sequence composition bias, have long plagued mitochondrial phylogenetics. In this study, we address this issue by 1) increasing the taxonomic representation of alphaproteobacterial genomes by sequencing 18 phylogenetically novel species, including 5 Rickettsiales and 4 Rhodospirillales, two orders that have previously shown close affiliations with mitochondria; 2) using a set of 29 slowly evolving mitochondria-derived nuclear genes that are less biased than mitochondria-encoded genes as alternative "well behaved" markers for phylogenetic analysis; and 3) applying site-heterogeneous mixture models that account for the sequence composition bias. With this integrated phylogenomic approach, we are able, for the first time, to place mitochondria unequivocally within the Rickettsiales order, as a sister clade to the Rickettsiaceae and Anaplasmataceae families, all subtended by the Holosporaceae family. Our results suggest that mitochondria most likely originated from a Rickettsiales endosymbiont already residing in the host, but not from the distantly related free-living Pelagibacter and Rhodospirillales.
Project description:The ubiquitous SAR11 bacterial clade is the most abundant type of organism in the world's oceans, but the reasons for its success are not fully elucidated. We analysed 128 surface marine metagenomes, including 37 new Antarctic metagenomes. The large size of the data set enabled internal transcribed spacer (ITS) regions to be obtained from the Southern polar region, enabling the first global characterization of the distribution of SAR11, from waters spanning temperatures from -2 to 30°C. Our data show a stable co-occurrence of phylotypes within both 'tropical' (>20°C) and 'polar' (<10°C) biomes, highlighting ecological niche differentiation between major SAR11 subgroups. All phylotypes display transitions in abundance that are strongly correlated with temperature and latitude. By assembling SAR11 genomes from Antarctic metagenome data, we identified specific genes, biases in gene functions and signatures of positive selection in the genomes of the polar SAR11, which are genomic signatures of adaptive radiation. Our data demonstrate the importance of adaptive radiation in the organism's ability to proliferate throughout the world's oceans, and describe genomic traits characteristic of different phylotypes in specific marine biomes.