ABSTRACT: Alu repetitive elements are found in approximately 1.4 million copies in the human genome, comprising more than one-tenth of it. Numerous studies describe exonizations of Alu elements, that is, splicing-mediated insertions of parts of Alu sequences into mature mRNAs. To study the connection between the exonization of Alu elements and alternative splicing, we used a database of ESTs and cDNAs aligned to the human genome. We compiled two exon sets, one of 1176 alternatively spliced internal exons, and another of 4151 constitutively spliced internal exons. Sixty-one alternatively spliced internal exons (5.2%) had a significant BLAST hit to an Alu sequence, but none of the constitutively spliced internal exons had such a hit. The vast majority (84%) of the Alu-containing exons that appeared within the coding region of mRNAs caused a frame-shift or a premature termination codon. Alu-containing exons were included in transcripts at lower frequencies than alternatively spliced exons that do not contain an Alu sequence. These results indicate that internal exons that contain an Alu sequence are predominantly, if not exclusively, alternatively spliced. Presumably, evolutionary events that cause a constitutive insertion of an Alu sequence into an mRNA are deleterious and selected against.
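The counts reported in the abstract above can be checked with a few lines of arithmetic. The sketch below recomputes the 5.2% figure and, purely as an illustration that is not taken from the study, computes the hypergeometric probability that all 61 Alu-containing exons would fall in the alternatively spliced set if Alu content were independent of splicing mode. This is the one-sided Fisher exact probability for the observed 61-versus-0 table. The function and constant names are hypothetical.

```python
from math import comb

# Counts taken from the abstract.
ALT_EXONS = 1176     # alternatively spliced internal exons
CONST_EXONS = 4151   # constitutively spliced internal exons
ALU_IN_ALT = 61      # alternative exons with a significant Alu hit
ALU_IN_CONST = 0     # constitutive exons with a significant Alu hit


def all_in_alt_probability(alt, const, alu_hits):
    """Probability that every one of `alu_hits` exons, drawn without
    replacement from alt + const exons, comes from the alternative set."""
    return comb(alt, alu_hits) / comb(alt + const, alu_hits)


alu_fraction = ALU_IN_ALT / ALT_EXONS
p_value = all_in_alt_probability(ALT_EXONS, CONST_EXONS, ALU_IN_ALT)

print(f"Alu-containing fraction of alternative exons: {alu_fraction:.1%}")
print(f"Probability of a 61-to-0 split under independence: {p_value:.2e}")
```

Because the observed table is the most extreme one possible for these margins, a single hypergeometric term is enough here; a less extreme table would require summing over all tables at least as extreme.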
Project description:Exonization of retroposed mobile elements, a process whereby new exons are generated following changes in non-protein-coding regions of a gene, is thought to have great potential for generating proteins with novel domains. Our previous analysis of primate-specific Alu-short interspersed elements (SINEs) showed, however, that during their 60 million years of evolution, SINE exonizations occurred in some primates, only to be lost again in some of the descendent lineages. This dynamic gain and loss makes it difficult to ascertain the contribution of exonization to genomic novelty. It was speculated that Alu-SINEs are too young to reveal persistent protein exaptation. In the present study we examined older mobile elements, mammalian-wide interspersed repeats (MIRs) that underwent active retroposition prior to the placental mammalian radiation approximately 130 million years ago, to determine their contribution to protein-coding sequences. Of 107 potential cases of MIR exonizations in human, an analysis of splice sites substantiates a mechanism that benefits from 3' splice site selection in MIR sequences. We retraced in detail the evolution of five MIR elements that exonized at different times during mammalian evolution. Four of these are expressed as alternatively spliced transcripts; three in species throughout the mammalian phylogenetic tree and one solely in primates. The fifth is the first experimentally verified, constitutively expressed retroposed SINE element in mammals. This pattern of highly conserved, alternatively and constitutively spliced MIR sequences evinces the potential of exonized transposed elements to evolve beyond the transient state found in Alu-SINEs and persist as important parts of functional proteins.
Project description:Insertion of transposed elements within mammalian genes is thought to be an important contributor to mammalian evolution and speciation. Insertion of transposed elements into introns can lead to their activation as alternatively spliced cassette exons, an event called exonization. Elucidating the evolutionary constraints that have shaped the fixation of transposed elements within human and mouse protein-coding genes, and their subsequent exonization, is important for understanding how the exonization process has affected transcriptome and proteome complexity. Here we show that exonization of transposed elements is biased towards the beginning of the coding sequence in both human and mouse genes. Analysis of single nucleotide polymorphisms (SNPs) revealed that exonization of transposed elements can be population-specific, implying that exonizations may enhance divergence and lead to speciation. SNP density analysis revealed differences between Alu and other transposed elements. Finally, we identified cases of primate-specific Alu elements that depend on RNA editing for their exonization. These results shed light on TE fixation and the exonization process within human and mouse genes.
Project description:More than 5% of alternatively spliced internal exons in the human genome are derived from Alu elements in a process termed exonization. Alus are composed of two homologous arms separated by an internal polypyrimidine tract (PPT). In most exonizations, splice sites are selected from within the same arm. We hypothesized that the internal PPT may prevent selection of a splice site further downstream. Here, we demonstrate that this PPT enhanced the selection of an upstream 5' splice site (5'ss), even in the presence of a stronger 5'ss downstream. Deletion of this PPT shifted selection to the stronger downstream 5'ss. This enhancing effect depended on the strength of the downstream 5'ss, on the efficiency of base-pairing to U1 snRNA, and on the length of the PPT. This effect of the PPT was mediated by the binding of TIA proteins and was dependent on the distance between the PPT and the upstream 5'ss. A wide-scale evolutionary analysis of introns across 22 eukaryotes revealed an enrichment in PPTs within approximately 20 nt downstream of the 5'ss. For most metazoans, the strength of the 5'ss inversely correlated with the presence of a downstream PPT, indicative of the functional role of the PPT. Finally, we found that the proteins that mediate this effect, TIA and U1C, and in particular their functional domains, are highly conserved across evolution. Overall, these findings expand our understanding of the role of TIA1/TIAR proteins in enhancing recognition of exons, in general, and Alu exons, in particular.
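The kind of scan implied by "PPTs within approximately 20 nt downstream of the 5'ss" can be sketched as follows. This is a minimal illustration, not the authors' method: the function name, the minimum tract length of 6 pyrimidines, and the choice to skip the first 6 intronic positions (so the 5'ss consensus itself is not counted) are all assumptions. Only the 20 nt window comes from the abstract.

```python
import re


def find_downstream_ppt(intron_seq, window=20, min_len=6, skip=6):
    """Return (start, end) intron coordinates of the first pyrimidine run of
    at least `min_len` nt found between positions `skip` and `window` of the
    intron, or None if there is no such run.

    `intron_seq` must start at the first intronic base (the G of GT/GU).
    `min_len` and `skip` are hypothetical defaults, not published values.
    """
    # Normalise RNA to DNA letters and restrict to the search region.
    region = intron_seq[skip:window].upper().replace("U", "T")
    match = re.search(r"[CT]{%d,}" % min_len, region)
    if match is None:
        return None
    start, end = match.span()
    return (start + skip, end + skip)


# Toy intron: GTAAGT consensus followed by a 10 nt pyrimidine run.
print(find_downstream_ppt("GTAAGTTTCTTTCTCCAGGACT"))
# Toy intron with a purine-rich region instead.
print(find_downstream_ppt("GTAAGTAGGAGGAAGGAGGAAG"))
```

Coordinates are 0-based and end-exclusive, matching Python slicing, so the returned span can be used directly to extract the tract from the input sequence.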
Project description:Exonization of Alu elements is a major mechanism for birth of new exons in primate genomes. Prior analyses of expressed sequence tags show that almost all Alu-derived exons are alternatively spliced, and the vast majority of these exons have low transcript inclusion levels. In this work, we provide genomic and experimental evidence for diverse splicing patterns of exonized Alu elements in human tissues. Using Exon array data of 330 Alu-derived exons in 11 human tissues and detailed RT-PCR analyses of 38 exons, we show that some Alu-derived exons are constitutively spliced in a broad range of human tissues, and some display a strong tissue-specific switch in their transcript inclusion levels. Most such exons are derived from ancient Alu elements in the genome. In SEPN1, mutations of which are linked to a form of congenital muscular dystrophy, the muscle-specific inclusion of an Alu-derived exon may be important for regulating SEPN1 activity in muscle. Real-time qPCR analysis of this SEPN1 exon in macaque and chimpanzee tissues indicates a human-specific increase in its transcript inclusion level and muscle specificity after the divergence of humans and chimpanzees. Our results imply that some Alu exonization events may have acquired adaptive benefits during the evolution of primate transcriptomes.
Project description:Examination of the human transcriptome reveals higher levels of RNA editing than in any other organism tested to date. This is indicative of extensive double-stranded RNA (dsRNA) formation within the human transcriptome. Most of the editing sites are located in the primate-specific retrotransposed element called Alu. A large fraction of Alus are found in intronic sequences, implying extensive Alu-Alu dsRNA formation in mRNA precursors. Yet, the effect of these intronic Alus on splicing of the flanking exons is largely unknown. Here, we show that more Alus flank alternatively spliced exons than constitutively spliced ones; this is especially notable for those exons that have changed their mode of splicing from constitutive to alternative during human evolution. This implies that Alu insertions may change the mode of splicing of the flanking exons. Indeed, we demonstrate experimentally that two Alu elements that were inserted into an intron in opposite orientation undergo base-pairing, as evident by RNA editing, and affect the splicing patterns of a downstream exon, shifting it from constitutive to alternative. Our results indicate the importance of intronic Alus in influencing the splicing of flanking exons, further emphasizing the role of Alus in shaping of the human transcriptome.
Project description:BACKGROUND: Transposable elements (TEs) have played an important role in the diversification and enrichment of mammalian transcriptomes through various mechanisms such as exonization and intronization (the birth of new exons/introns from previously intronic/exonic sequences, respectively), and insertion into first and last exons. However, no extensive analysis has compared the effects of TEs on the transcriptomes of mammals, non-mammalian vertebrates and invertebrates. RESULTS: We analyzed the influence of TEs on the transcriptomes of five species, three invertebrates and two non-mammalian vertebrates. Compared to previously analyzed mammals, there were lower levels of TE introduction into introns, significantly lower numbers of exonizations originating from TEs and a lower percentage of TE insertion within the first and last exons. Although the transcriptomes of vertebrates exhibit significant levels of exonization of TEs, only anecdotal cases were found in invertebrates. In vertebrates, as in mammals, the exonized TEs are mostly alternatively spliced, indicating that selective pressure maintains the original mRNA product generated from such genes. CONCLUSIONS: Exonization of TEs is widespread in mammals, less so in non-mammalian vertebrates, and very low in invertebrates. We assume that the exonization process depends on the length of introns. Vertebrates, unlike invertebrates, are characterized by long introns and short internal exons. Our results suggest that there is a direct link between the length of introns and exonization of TEs and that this process became more prevalent following the appearance of mammals.
Project description:BACKGROUND: A large proportion of species-specific exons are alternatively spliced. In primates, Alu elements play a crucial role in the process of exon creation but many new exons have appeared through other mechanisms. Despite many recent studies, it is still unclear what the splicing regulatory requirements for de novo exonization are and how splicing regulation changes throughout an exon's lifespan. RESULTS: Using comparative genomics, we have defined sets of exons with different evolutionary ages. Younger exons have weaker splice-sites and lower absolute values for the relative abundance of putative splicing regulators between exonic and adjacent intronic regions, indicating a less consolidated splicing regulation. This relative abundance is shown to increase with exon age, leading to higher exon inclusion. We show that this local difference in the density of regulators might be of biological significance, as it outperforms other measures in real exon versus pseudo-exon classification. We apply this new measure to the specific case of the exonization of anti-sense Alu elements and show that they are characterized by a general lack of exonic splicing silencers. CONCLUSIONS: Our results suggest that specific sequence environments are required for exonization and that these can change with time. We propose a model of exon creation and establishment in human genes, in which splicing decisions depend on the relative local abundance of regulatory motifs. Using this model, we provide further explanation as to why Alu elements serve as a major substrate for exon creation in primates. Finally, we discuss the benefits of integrating such information in gene prediction.
Project description:Human internal exons have an average size of 147 nt, and most are <300 nt. This small size is thought to facilitate exon definition. A small number of large internal exons have been identified and shown to be alternatively spliced. We identified 1115 internal exons >1000 nt in the human genome; these were found in 5% of all protein-coding genes, and most were expressed and translated. Surprisingly, 40% of these were expressed at levels similar to the flanking exons, suggesting they were constitutively spliced. While all of the large exons had strong splice sites, the constitutively spliced large exons had a higher ratio of splicing enhancers/silencers and were more conserved across mammals than the alternatively spliced large exons. We asked if large exons contain specific sequences that promote splicing and identified 38 sequences enriched in the large exons relative to small exons. The consensus sequence is C-rich with a central invariant CA dinucleotide. Mutation of these sequences in a candidate large exon indicated that these are important for recognition of large exons by the splicing machinery. We propose that these sequences are large exon splicing enhancers (LESEs).
Project description:Transcriptional isoforms are not just random combinations of exons. What causes exons to be differentially spliced, and are exons with different splicing frequencies subject to divergent regulation by potential elements or splicing signals? Beyond the conventional classification into alternatively spliced exons (ASEs) and constitutively spliced exons (CSEs), we classified exons from alternatively spliced human genes and their mouse orthologs (12,314 and 5,464, respectively) into four types based on their splicing frequencies. The analysis indicated that the different groups of exons have divergent compositional and regulatory properties. Interestingly, as splicing frequency decreases, exons tend to be longer, to have higher GC content, and to contain more splicing elements and repetitive elements, which suggests that splicing frequency is influenced by such factors. Comparison of non-alternatively spliced (NAS) mouse genes with their alternatively spliced human orthologs also suggested that exons with lower splicing frequencies may be newly evolved ones that gained functions as their splicing frequencies changed during evolution. Our findings reveal for the first time that certain factors may have a critical influence on splicing frequency, and suggest that exons with lower splicing frequencies may originate from old repetitive sequences whose splice sites were altered by mutation, gaining novel functions and becoming more frequently spliced.
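A four-way grouping of exons by splicing frequency, as described in the abstract above, could look like the sketch below. The abstract gives neither the definition of splicing frequency nor the cutoffs, so both are assumptions here: frequency is taken as the fraction of a gene's transcripts that include the exon, and the three cutoffs (0.25, 0.50, 0.75) are placeholders. All function names and the toy exon counts are hypothetical.

```python
def splicing_frequency(transcripts_with_exon, total_transcripts):
    """Fraction of a gene's transcripts that include the exon.
    This definition is an assumption, not taken from the abstract."""
    if total_transcripts <= 0:
        raise ValueError("total_transcripts must be positive")
    return transcripts_with_exon / total_transcripts


# Hypothetical cutoffs; the abstract does not state the ones actually used.
HYPOTHETICAL_CUTOFFS = (0.25, 0.50, 0.75)


def frequency_class(freq, cutoffs=HYPOTHETICAL_CUTOFFS):
    """Return a class index from 0 (lowest frequency) to len(cutoffs)
    (highest frequency)."""
    for index, cutoff in enumerate(cutoffs):
        if freq < cutoff:
            return index
    return len(cutoffs)


# Toy data: exon name -> (transcripts including the exon, total transcripts).
exons = {"exon_a": (2, 20), "exon_b": (9, 20), "exon_c": (20, 20)}
classes = {
    name: frequency_class(splicing_frequency(*counts))
    for name, counts in exons.items()
}
print(classes)
```

With three cutoffs the function yields four classes, and an exon included in every transcript lands in the top class, which corresponds to the conventional constitutively spliced category.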
Project description:Exonic splicing enhancers (ESEs) are pre-mRNA cis-acting elements required for splice-site recognition. We previously developed a web-based program called ESEfinder that scores any sequence for the presence of ESE motifs recognized by the human SR proteins SF2/ASF, SRp40, SRp55 and SC35 (http://rulai.cshl.edu/tools/ESE/). Using ESEfinder, we have undertaken a large-scale analysis of ESE motif distribution in human protein-coding genes. Significantly higher frequencies of ESE motifs were observed in constitutive internal protein-coding exons, compared with both their flanking intronic regions and with pseudo exons. Statistical analysis of ESE motif frequency distributions revealed a complex relationship between splice-site strength and increased or decreased frequencies of particular SR protein motifs. Comparison of constitutively and alternatively spliced exons demonstrated slightly weaker splice-site scores, as well as significantly fewer ESE motifs, in the alternatively spliced group. Our results underline the importance of ESE-mediated SR protein function in the process of exon definition, in the context of both constitutive splicing and regulated alternative splicing.
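Motif scoring of the kind described above is commonly done by sliding a position weight matrix along a sequence and keeping windows whose score reaches a threshold. The sketch below shows that generic technique only. It is not ESEfinder's implementation, and the three-position matrix, its values, and the threshold are invented for illustration and do not correspond to any SR protein motif.

```python
# Toy weight matrix: one dict of per-base scores per motif position.
# All values are invented for illustration.
TOY_MATRIX = [
    {"A": 1.0, "C": -1.0, "G": 0.5, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": 0.0},
    {"A": 0.5, "C": -1.0, "G": 1.0, "T": -1.0},
]


def score_window(window, matrix):
    """Sum the per-position scores for one window of len(matrix) bases."""
    return sum(column[base] for column, base in zip(matrix, window))


def scan_motif(seq, matrix, threshold):
    """Return (position, window, score) for every window scoring at or
    above `threshold`. Windows with non-ACGT letters are skipped."""
    seq = seq.upper().replace("U", "T")
    width = len(matrix)
    hits = []
    for pos in range(len(seq) - width + 1):
        window = seq[pos:pos + width]
        if set(window) <= set("ACGT"):
            score = score_window(window, matrix)
            if score >= threshold:
                hits.append((pos, window, score))
    return hits


def motif_density(seq, matrix, threshold):
    """Hits per nucleotide, so sequences of different lengths can be compared."""
    return len(scan_motif(seq, matrix, threshold)) / len(seq) if seq else 0.0


print(scan_motif("ACGTACG", TOY_MATRIX, 2.5))
print(motif_density("ACGTACG", TOY_MATRIX, 2.5))
```

Normalising the hit count by sequence length is one simple way to compare motif frequencies between exons, flanking introns, and pseudo exons of different sizes.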