Intron Invasions Trace Algal Speciation and Reveal Nearly Identical Arctic and Antarctic Micromonas Populations.
ABSTRACT: Spliceosomal introns are a hallmark of eukaryotic genes that are hypothesized to play important roles in genome evolution but have poorly understood origins. Although most introns lack sequence homology to each other, new families of spliceosomal introns that are repeated hundreds of times in individual genomes have recently been discovered in a few organisms. The prevalence and conservation of these introner elements (IEs) or introner-like elements in other taxa, as well as their evolutionary relationships to regular spliceosomal introns, are still unknown. Here, we systematically investigate introns in the widespread marine green alga Micromonas and report new families of IEs, numerous intron presence-absence polymorphisms, and potential intron insertion hot-spots. The new families enabled identification of conserved IE secondary structure features and establishment of a novel general model for repetitive intron proliferation across genomes. Despite shared secondary structure, the IE families from each Micromonas lineage bear no obvious sequence similarity to those in the other lineages, suggesting that their appearance is intimately linked with the process of speciation. Two of the new IE families come from an Arctic culture (Micromonas Clade E2) isolated from a polar region where abundance of this alga is increasing due to climate-induced changes. The same two families were detected in metagenomic data from Antarctica, a system where Micromonas has never before been reported. Strikingly high identity between the Arctic isolate and Antarctic coding sequences that flank the IEs suggests connectivity between populations in the two polar systems that we postulate occurs through deep-sea currents. Recovery of Clade E2 sequences in North Atlantic Deep Waters beneath the Gulf Stream supports this hypothesis. Our research illuminates the dynamic relationships between an unusual class of repetitive introns, genome evolution, speciation, and global distribution of this sentinel marine alga.
Project description:Genes in pieces and spliceosomal introns are a hallmark of eukaryotes, with intron invasion usually assumed to have happened early in evolution. Here, we analyze the intron landscape of Micromonas, a unicellular green alga in the Mamiellophyceae lineage, demonstrating the coexistence of several classes of introns and the occurrence of recent massive intron invasion. This study focuses on two strains, CCMP1545 and RCC299, and their related individuals from ocean samplings, showing that they not only harbor different classes of introns depending on their location in the genome, as for other Mamiellophyceae, but also uniquely carry several classes of repeat introns. These introns, dubbed introner elements (IEs), are found at novel positions in genes and, unlike canonical introns, have conserved sequences. This IE invasion has a huge impact on the genome, doubling the number of introns in the CCMP1545 strain. We hypothesize that each IE class originated from a single ancestral IE that has been colonizing the genome since strain divergence by inserting copies of itself into genes through intron transposition, likely involving reverse splicing. Along with similar cases recently observed in other organisms, our observations in Micromonas strains shed new light on the evolution of introns, suggesting that intron gain is more widespread than previously thought.
Project description:As part of the exploratory sequencing program Génolevures, visual scrutiny and bioinformatic tools were used to detect spliceosomal introns in seven hemiascomycetous yeast species. A total of 153 putative novel introns were identified. Introns are rare in yeast nuclear genes (<5% have an intron), mainly located at the 5' end of ORFs, and not highly conserved in sequence. They all share a clear non-random vocabulary: conserved splice sites and conserved nucleotide contexts around splice sites. Homologues of metazoan snRNAs and putative homologues of SR splicing factors were identified, confirming that the spliceosomal machinery is highly conserved in eukaryotes. Several intron features were tested as possible markers for phylogenetic analysis. We found that intron sizes vary widely within each genome, and according to the phylogenetic position of the yeast species. The evolutionary origin of spliceosomal introns was examined by analysing the degree of conservation of intron positions in homologous yeast genes. Most introns appeared to exist in the last common ancestor of present-day yeast species, and then to have been differentially lost during speciation. However, in some cases, it is difficult to exclude a possible sliding event affecting a pre-existing intron or a gain of a novel intron. Taken together, our results indicate that the origin of spliceosomal introns is complex within a given genome, and that present-day introns may have resulted from a dynamic flux between intron conservation, intron loss and intron gain during the evolution of hemiascomycetous yeasts.
Project description:The role of spliceosomal introns in eukaryotic genomes remains obscure. A large scale analysis of intron presence/absence patterns in many gene families and species is a necessary step to clarify the role of these introns. In this analysis, we used a maximum likelihood method to reconstruct the evolution of 2,961 introns in a dataset of 76 ribosomal protein genes from 22 eukaryotes and validated the results by a maximum parsimony method. Our results show that the trends of intron gain and loss differed across species in a given kingdom but appeared to be consistent within subphyla. Most subphyla in the dataset diverged around 1 billion years ago, when the "Big Bang" radiation occurred. We speculate that spliceosomal introns may play a role in the explosion of many eukaryotes at the Big Bang radiation.
Project description:BACKGROUND:We have studied spliceosomal introns in the ribosomal (r)RNA of fungi to discover the forces that guide their insertion and fixation. RESULTS:Comparative analyses of flanking sequences at 49 different spliceosomal intron sites showed that the G - intron - G motif is the conserved flanking sequence at sites of intron insertion. Information analysis showed that these rRNA introns contain significant information in the flanking exons. Analysis of all rDNA introns in the three phylogenetic domains and two organelles showed that group I introns are usually located after the most conserved sites in rRNA, whereas spliceosomal introns occur at less conserved positions. The distribution of spliceosomal and group I introns in the primary structure of small and large subunit rRNAs was tested with simulations using the broken-stick model as the null hypothesis. This analysis suggested that the spliceosomal and group I intron distributions were not produced by a random process. Sequence upstream of rRNA spliceosomal introns was significantly enriched in G nucleotides. We speculate that these G-rich regions may function as exonic splicing enhancers that guide the spliceosome and facilitate splicing. CONCLUSIONS:Our results begin to define some of the rules that guide the distribution of rRNA spliceosomal introns and suggest that the exon context is of fundamental importance in intron fixation.
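As a rough illustration of the flanking-sequence comparison described in the abstract above, the following minimal Python sketch counts how often an intron is bounded by a G on both exon sides (the "G - intron - G" pattern). The function names, the 0-based half-open coordinate convention, and the toy sequences are assumptions made for this example; they are not taken from the study, which does not describe its implementation.

```python
def flanking_bases(seq, intron_start, intron_end):
    """Return the exon bases immediately 5' and 3' of an intron.

    Assumption for this sketch: `seq` is the unspliced sequence and the
    intron occupies seq[intron_start:intron_end] (0-based, half-open).
    """
    if intron_start < 1 or intron_end >= len(seq):
        raise ValueError("intron must have at least one flanking base on each side")
    return seq[intron_start - 1].upper(), seq[intron_end].upper()


def g_intron_g_fraction(sites):
    """Fraction of intron sites flanked by G on both sides.

    `sites` is a list of (seq, intron_start, intron_end) tuples.
    """
    if not sites:
        raise ValueError("no intron sites given")
    hits = sum(
        1 for seq, start, end in sites
        if flanking_bases(seq, start, end) == ("G", "G")
    )
    return hits / len(sites)


# Toy data (hypothetical): exon + intron (GT...AG) + exon.
toy_sites = [
    ("AAG" + "GTAAGTTTAG" + "GCC", 3, 13),  # G | intron | G
    ("AAT" + "GTAAGTTTAG" + "GCC", 3, 13),  # T | intron | G
]
fraction = g_intron_g_fraction(toy_sites)
```

With real data, the same count could be compared against the background G frequency of the rRNA to judge enrichment; the statistical tests used in the study itself are not reproduced here.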
Project description:BACKGROUND:Two spliceosomal intron types co-exist in eukaryotic precursor mRNAs and are excised by distinct U2-dependent and U12-dependent spliceosomes. In the diplomonad Giardia lamblia, small nuclear (sn) RNAs show hybrid characteristics of U2- and U12-dependent spliceosomal snRNAs and 5 of 11 identified remaining spliceosomal introns are trans-spliced. It is unknown whether unusual intron and spliceosome features are conserved in other diplomonads. RESULTS:We have identified spliceosomal introns, snRNAs and proteins from two additional diplomonads for which genome information is currently available, Spironucleus vortens and Spironucleus salmonicida, as well as relatives, including 6 verified cis-spliceosomal introns in S. vortens. Intron splicing signals are mostly conserved between the Spironucleus species and G. lamblia. Similar to 'long' G. lamblia introns, RNA secondary structural potential is evident for 'long' (>50 nt) Spironucleus introns as well as introns identified in the parabasalid Trichomonas vaginalis. Base pairing within these introns is predicted to constrain spatial distances between splice junctions to distances similar to those seen in the shorter and uniformly-sized introns in these organisms. We find that several remaining Spironucleus spliceosomal introns are ancient. We identified a candidate U2 snRNA from S. vortens, and U2 and U5 snRNAs in S. salmonicida; cumulatively, these illustrate significant snRNA differences within some diplomonads. Finally, we studied spliceosomal protein complements and find protein sets in Giardia, Spironucleus and Trepomonas sp. PC1 highly reduced but well conserved across the clade, with between 44 and 62 out of 174 studied spliceosomal proteins detectable. Comparison with more distant relatives revealed a highly nested pattern, with the more intron-rich fornicate Kipferlia bialata retaining 87 total proteins including nearly all those observed in the diplomonad representatives, and the oxymonad Monocercomonoides retaining 115 total proteins including nearly all those observed in K. bialata. CONCLUSIONS:Comparisons in diplomonad representatives and species of other closely-related metamonad groups indicate similar patterns of intron structural conservation and spliceosomal protein composition but significant divergence of snRNA structure in genomically-reduced species. Relative to other eukaryotes, loss of evolutionarily-conserved snRNA domains and common sets of spliceosomal proteins points to a more streamlined splicing mechanism, where intron sequences and structures may be functionally compensating for the minimalization of spliceosome components.
Project description:BACKGROUND: Spliceosomal introns are an ancient, widespread hallmark of eukaryotic genomes. Despite much research, many questions regarding the origin and evolution of spliceosomal introns remain unsolved, partly due to the difficulty of inferring ancestral gene structures. We circumvent this problem by using genes originated by endosymbiotic gene transfer, in which an intron-less structure at the time of the transfer can be assumed. RESULTS: By comparing the exon-intron structures of 64 mitochondrial-derived genes that were transferred to the nucleus at different evolutionary periods, we can trace the history of intron gains in different eukaryotic lineages. Our results show that the intron density of genes transferred relatively recently to the nuclear genome is similar to that of genes originated by more ancient transfers, indicating that gene structure can be rapidly shaped by intron gain after the integration of the gene into the genome and that this process is mainly determined by forces acting specifically on each lineage. We analyze 12 cases of mitochondrial-derived genes that have been transferred to the nucleus independently in more than one lineage. CONCLUSIONS: Remarkably, the proportion of shared intron positions that were gained independently in homologous genes is similar to that proportion observed in genes that were transferred prior to the speciation event and whose shared intron positions might be due to vertical inheritance. A particular case of parallel intron gain in the nad7 gene is discussed in more detail.
Project description:Group II introns are closely linked to eukaryote evolution because nuclear spliceosomal introns and the small RNAs associated with the spliceosome are thought to trace their ancient origins to these mobile elements. Therefore, elucidating how group II introns move, and how they lose mobility can potentially shed light on fundamental aspects of eukaryote biology. To this end, we studied five strains of the unicellular red alga Porphyridium purpureum that surprisingly contain 42 group II introns in their plastid genomes. We focused on a subset of these introns that encode mobility-conferring intron-encoded proteins (IEPs) and found them to be distributed among the strains in a lineage-specific manner. The reverse transcriptase and maturase domains were present in all lineages but the DNA endonuclease domain was deleted in vertically inherited introns, demonstrating a key step in the loss of mobility. P. purpureum plastid intron RNAs had a classic group IIB secondary structure despite variability in the DIII and DVI domains. We report for the first time the presence of twintrons (introns-within-introns, derived from the same mobile element) in Rhodophyta. The P. purpureum IEPs and their mobile introns provide a valuable model for the study of mobile retroelements in eukaryotes and offer promise for biotechnological applications.
Project description:The presence of spliceosomal introns in eukaryotes raises a range of questions about genomic evolution. Along with the fundamental mysteries of introns' initial proliferation and persistence, the evolutionary forces acting on intron sequences remain largely mysterious. Intron number varies across species from a few introns per genome to several introns per gene, and the elements of intron sequences directly implicated in splicing vary from degenerate to strict consensus motifs. We report a 50-species comparative genomic study of intron sequences across most eukaryotic groups. We find two broad and striking patterns. First, we find that some highly intron-poor lineages have undergone evolutionary convergence to strong 3' consensus intron structures. This finding holds for both branch point sequence and distance between the branch point and the 3' splice site. Interestingly, this difference appears to exist within the genomes of green algae of the genus Ostreococcus, which exhibit highly constrained intron sequences through most of the intron-poor genome, but not in one much more intron-dense genomic region. Second, we find evidence that ancestral genomes contained highly variable branch point sequences, similar to more complex modern intron-rich eukaryotic lineages. In addition, ancestral structures are likely to have included polyT tails similar to those in metazoans and plants, which we found in a variety of protist lineages. Intriguingly, intron structure evolution appears to be quite different across lineages experiencing different types of genome reduction: whereas lineages with very few introns tend towards highly regular intronic sequences, lineages with very short introns tend towards highly degenerate sequences. Together, these results attest to the complex nature of ancestral eukaryotic splicing, the qualitatively different evolutionary forces acting on intron structures across modern lineages, and the impressive evolutionary malleability of eukaryotic gene structures.
Project description:BACKGROUND: The origin of spliceosomal introns is the central subject of the introns-early versus introns-late debate. The distribution of intron phases is non-uniform, with an excess of phase-0 introns. Introns-early explains this by speculating that a fraction of present-day introns were present between minigenes in the progenote and therefore must lie in phase-0. In contrast, introns-late predicts that the nonuniformity of intron phase distribution reflects the nonrandomness of intron insertions. RESULTS: In this paper, we tested the two theories using analyses of intron phase distribution. We inferred the evolution of intron phase distribution from a dataset of 684 gene orthologs from seven eukaryotes using a maximum likelihood method. We also tested whether the observed intron phase distributions from 10 eukaryotes can be explained by intron insertions on a genome-wide scale. In contrast to the prediction of introns-early, the inferred evolution of intron phase distribution showed that the proportion of phase-0 introns increased over evolution. Consistent with introns-late, the observed intron phase distributions matched those predicted by an intron insertion model quite well. CONCLUSION: Our results strongly support the introns-late hypothesis of the origin of spliceosomal introns.
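The intron phase distributions discussed in the abstract above rest on a simple definition: an intron's phase is the number of coding nucleotides upstream of it, modulo 3 (phase 0 falls between codons, phases 1 and 2 fall inside a codon). The Python sketch below shows how an observed phase distribution could be tallied from coding exon lengths. The function names and toy gene structures are hypothetical, and the maximum likelihood inference and insertion model used in the study are not reproduced.

```python
from collections import Counter


def intron_phases(exon_lengths):
    """Phase of each intron in one gene.

    `exon_lengths` lists the coding length of each exon in 5'->3' order
    (assumption: UTR bases are excluded). A gene with n exons has n - 1
    introns, so the last exon contributes no intron.
    """
    phases = []
    upstream = 0
    for length in exon_lengths[:-1]:
        upstream += length
        phases.append(upstream % 3)
    return phases


def phase_distribution(genes):
    """Proportion of phase-0, phase-1 and phase-2 introns across genes."""
    counts = Counter()
    for exon_lengths in genes:
        counts.update(intron_phases(exon_lengths))
    total = sum(counts.values())
    if total == 0:
        return {0: 0.0, 1: 0.0, 2: 0.0}
    return {phase: counts[phase] / total for phase in (0, 1, 2)}


# Toy gene structures (hypothetical coding exon lengths).
toy_genes = [[90, 40, 51], [30, 30]]
distribution = phase_distribution(toy_genes)
```

An excess of phase-0 introns would show up here as `distribution[0]` exceeding one third; whether that excess reflects ancient minigene boundaries or non-random insertion is the question the abstract addresses.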
Project description:BACKGROUND: Only one spliceosomal-type intron has previously been identified in the unicellular eukaryotic parasite, Giardia lamblia (a diplomonad). This intron is only 35 nucleotides in length and is unusual in possessing a non-canonical 5' intron boundary sequence, CT, instead of GT. RESULTS: We have identified a second spliceosomal-type intron in G. lamblia, in the ribosomal protein L7a gene (Rpl7a), that possesses a canonical GT 5' intron boundary sequence. A comparison of the two known Giardia intron sequences revealed extensive nucleotide identity at both the 5' and 3' intron boundaries, similar to the conserved sequence motifs recently identified at the boundaries of spliceosomal-type introns in Trichomonas vaginalis (a parabasalid). Based on these observations, we searched the partial G. lamblia genome sequence for these conserved features and identified a third spliceosomal intron, in an unassigned open reading frame. Our comprehensive analysis of the Rpl7a intron in other eukaryotic taxa demonstrates that it is evolutionarily conserved and is an ancient eukaryotic intron. CONCLUSION: An analysis of the phylogenetic distribution and properties of the Rpl7a intron suggests its utility as a phylogenetic marker to evaluate particular eukaryotic groupings. Additionally, analysis of the G. lamblia introns has provided further insight into some of the conserved and unique features possessed by the recently identified spliceosomal introns in related organisms such as T. vaginalis and Carpediemonas membranifera.