Positive selection in alternatively spliced exons of human genes.
ABSTRACT: Alternative splicing is a well-recognized mechanism of accelerated genome evolution. We have studied single-nucleotide polymorphisms and human-chimpanzee divergence in the exons of 6672 alternatively spliced human genes, with the aim of understanding the forces driving the evolution of alternatively spliced sequences. Here, we show that alternatively spliced exons and exon fragments (alternative exons) from minor isoforms experience lower selective pressure at the amino acid level, accompanied by selection against synonymous sequence variation. The results of the McDonald-Kreitman test suggest that alternatively spliced exons, unlike exons constitutively included in the mRNA, are also subject to positive selection, with up to 27% of amino acids fixed by positive selection.
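The McDonald-Kreitman comparison mentioned above can be illustrated with a minimal sketch. It assumes the commonly used estimator alpha = 1 - (Ds * Pn) / (Dn * Ps), where Dn and Ds are nonsynonymous and synonymous fixed differences between species and Pn and Ps are nonsynonymous and synonymous polymorphisms within species. The function names and the counts in the example are hypothetical and are not taken from the study; the abstract does not state which estimator the authors used.

```python
def mk_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Estimate the fraction of amino acid substitutions fixed by positive
    selection from McDonald-Kreitman counts (assumed estimator:
    alpha = 1 - (Ds * Pn) / (Dn * Ps))."""
    if dn <= 0 or ps <= 0:
        raise ValueError("Dn and Ps must be positive for alpha to be defined")
    return 1.0 - (ds * pn) / (dn * ps)


def neutrality_index(dn: int, ds: int, pn: int, ps: int) -> float:
    """Neutrality index NI = (Pn / Ps) / (Dn / Ds); values below 1 are
    usually read as an excess of nonsynonymous divergence."""
    if dn <= 0 or ds <= 0 or ps <= 0:
        raise ValueError("Dn, Ds and Ps must be positive for NI to be defined")
    return (pn / ps) / (dn / ds)


# Hypothetical counts chosen only to illustrate the arithmetic.
alpha = mk_alpha(dn=100, ds=200, pn=73, ps=200)
ni = neutrality_index(dn=100, ds=200, pn=73, ps=200)
```

With these made-up counts, alpha and the neutrality index are complementary (alpha = 1 - NI), so an alpha near 0.27 corresponds to a neutrality index near 0.73.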
Project description:Alu repetitive elements are found in approximately 1.4 million copies in the human genome, comprising more than one-tenth of it. Numerous studies describe exonizations of Alu elements, that is, splicing-mediated insertions of parts of Alu sequences into mature mRNAs. To study the connection between the exonization of Alu elements and alternative splicing, we used a database of ESTs and cDNAs aligned to the human genome. We compiled two exon sets, one of 1176 alternatively spliced internal exons, and another of 4151 constitutively spliced internal exons. Sixty-one alternatively spliced internal exons (5.2%) had a significant BLAST hit to an Alu sequence, but none of the constitutively spliced internal exons had such a hit. The vast majority (84%) of the Alu-containing exons that appeared within the coding region of mRNAs caused a frame-shift or a premature termination codon. Alu-containing exons were included in transcripts at lower frequencies than alternatively spliced exons that do not contain an Alu sequence. These results indicate that internal exons that contain an Alu sequence are predominantly, if not exclusively, alternatively spliced. Presumably, evolutionary events that cause a constitutive insertion of an Alu sequence into an mRNA are deleterious and selected against.
Project description:Genetic analyses and systematic mutagenesis have revealed that synonymous, non-synonymous and intronic mutations frequently alter the inclusion levels of alternatively spliced exons, consistent with the concept that altered splicing might be a common mechanism by which mutations cause disease. However, most exons expressed in any cell are highly-included in mature mRNAs. Here, by performing deep mutagenesis of highly-included exons and by analysing the association between genome sequence variation and exon inclusion across the transcriptome, we report that mutations only very rarely alter the inclusion of highly-included exons. This is true for both exonic and intronic mutations as well as for perturbations in trans. Therefore, mutations that affect splicing are not evenly distributed across primary transcripts but are focussed in and around alternatively spliced exons with intermediate inclusion levels. These results provide a resource for prioritising synonymous and other variants as disease-causing mutations.
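The notion of an exon's inclusion level used above is often quantified as percent spliced in (PSI). The following is a minimal sketch under that assumption: PSI is computed from counts of reads that support inclusion versus skipping of the exon. The function names and the 10% and 90% cut-offs are hypothetical choices for illustration; the abstract does not give the thresholds the authors used.

```python
def percent_spliced_in(inclusion_reads: int, exclusion_reads: int) -> float:
    """Return PSI (0-100): the share of informative reads that support
    inclusion of the exon."""
    total = inclusion_reads + exclusion_reads
    if total <= 0:
        raise ValueError("no informative reads for this exon")
    return 100.0 * inclusion_reads / total


def classify_inclusion(psi: float, low: float = 10.0, high: float = 90.0) -> str:
    """Bin an exon by PSI. The cut-offs are illustrative assumptions,
    not values reported in the study."""
    if psi >= high:
        return "highly included"
    if psi <= low:
        return "mostly skipped"
    return "intermediate"


# Hypothetical read counts for two exons.
psi_a = percent_spliced_in(inclusion_reads=980, exclusion_reads=20)
psi_b = percent_spliced_in(inclusion_reads=450, exclusion_reads=550)
labels = [classify_inclusion(psi_a), classify_inclusion(psi_b)]
```

Under these assumed cut-offs, the first exon would fall in the highly included class that the study reports as rarely affected by mutations, and the second in the intermediate class where splicing-altering mutations are concentrated.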
Project description:Comparison of the sequences of mouse and human genomes revealed a surprising number of nonexonic, nonexpressed conserved sequences, for which no function could be assigned. To study the possible correlation between these conserved intronic sequences and alternative splicing regulation, we developed a method to identify exons that are alternatively spliced in both human and mouse. We compiled two exon sets: one of alternatively spliced conserved exons and another of constitutively spliced conserved exons. We found that 77% of the conserved alternatively spliced exons were flanked on both sides by long conserved intronic sequences. In comparison, only 17% of the conserved constitutively spliced exons were flanked by such conserved intronic sequences. The average length of the conserved intronic sequences was 103 bases in the upstream intron and 94 bases in the downstream intron. The average identity levels in the immediately flanking intronic sequences were 88% and 80% for the upstream and downstream introns, respectively, higher than the conservation levels of 77% that were measured in promoter regions. Our results suggest that the function of many of the intronic sequence blocks that are conserved between human and mouse is the regulation of alternative splicing.
Project description:Alternative splicing of pre-mRNAs is a crucial mechanism for maintaining protein diversity in eukaryotes without requiring a considerable increase in the number of genes. Due to rapid advances in high-throughput sequencing technologies and computational algorithms, it is anticipated that alternative splicing events will be more intensively studied to address different kinds of biological questions. The occurrence of alternative splicing means that all exons can be classified as either constitutively or alternatively spliced, depending on whether they are included in virtually all mature mRNAs. From an evolutionary point of view, therefore, alternatively spliced exons are expected to have distinctive biological characteristics in comparison with constitutively spliced exons. In this paper, we first outline the representative types of alternative splicing events and exon classification, and then review sequence and evolutionary features of alternatively spliced exons. The main purpose is to facilitate understanding of the biological implications of alternative splicing in eukaryotes. This knowledge is also helpful for establishing computational approaches for predicting the splicing pattern of exons.
Project description:Systematic mutagenesis has revealed that synonymous, non-synonymous and intronic mutations frequently alter the inclusion levels of alternatively spliced exons, suggesting that altered splicing might be a common mechanism by which mutations cause disease. However, most exons expressed in any cell are highly-included in mature mRNAs. Here, by performing deep mutagenesis of highly-included exons and by analysing the association between sequence variation and exon inclusion across the genome, we report that mutations only very rarely alter the inclusion of highly-included exons. This is true for both exonic and intronic mutations as well as for perturbations in trans. Therefore, mutations that affect splicing are not evenly distributed across the genome but are focussed in and around alternatively spliced exons with intermediate inclusion levels. These results provide a resource for prioritising synonymous and other variants as disease-causing mutations.
Project description:Alternative splicing and gene duplication are the two main processes responsible for expanding protein functional diversity. Although gene duplication can generate new genes and alternative splicing can introduce variation through alternative gene products, the interplay between the two processes is complex and poorly understood. Here, we have carried out a study of the evolution of alternatively spliced exons after gene duplication to better understand the interaction between the two processes. We created a manually curated set of 97 human genes with mutually exclusively spliced homologous exons and analyzed the evolution of these exons across five distantly related vertebrates (lamprey, spotted gar, zebrafish, fugu, and coelacanth). Most of these exons had an ancient origin (more than 400 Ma). We found examples supporting two extreme evolutionary models for the behaviour of homologous exons after gene duplication. We observed 11 events in which gene duplication was accompanied by splice isoform separation, that is, each paralog specifically conserved just one distinct ancestral homologous exon. At the other extreme, we identified genes in which the homologous exons were always conserved within paralogs, suggesting that the alternative splicing event cannot easily be separated from the function in these genes. That many homologous exons fall in between these two extremes highlights the diversity of biological systems and suggests that the subtle balance between alternative splicing and gene duplication is adjusted to the specific cellular context of each gene.
Project description:Circular RNAs (circRNAs) are a relatively new class of RNA molecules, and knowledge about their biogenesis and function is still in its infancy. It was recently shown that alternative splicing underlies the formation of circRNAs arising from the Titin (TTN) gene. Since the main mechanism by which circRNAs are formed is still unclear, we hypothesized that alternative splicing, and in particular exon skipping, is a major driver of circRNA production. We performed RNA sequencing on human and mouse hearts, mapped alternative splicing events, and overlaid these with expressed circRNAs at exon-level resolution. In addition, we performed RNA sequencing on hearts of Rbm20 KO mice to address how important Rbm20-mediated alternative splicing is in the production of cardiac circRNAs. In human and mouse hearts, we show that cardiac circRNAs are mostly (~90%) produced from constitutive exons and less (~10%) from alternatively spliced exons. In Rbm20 KO hearts, we identified 38 differentially expressed circRNAs, of which 12 were produced from the Ttn gene. Even though Ttn appeared the most prominent target of Rbm20 for circularization, we also detected Rbm20-dependent circRNAs arising from other genes including Fan1, Stk39, Xdh, Bcl2l13, and Sorbs1. Interestingly, only Ttn circRNAs seemed to arise from Rbm20-mediated skipped exons. In conclusion, cardiac circRNAs are mostly derived from constitutive exons, suggesting that these circRNAs are generated at the expense of their linear counterpart and that circRNA production impacts the accumulation of the linear mRNA.
Project description:The CD44 molecule is known to display extensive size heterogeneity, which has been attributed both to alternative splicing and to differential glycosylation within the extracellular domain. Although the presence of several alternative exons has been partly inferred from cDNA sequencing, the precise intron-exon organization of the CD44 gene has not been described to date to our knowledge. In the present study we describe the structure of the human CD44 gene, which contains at least 19 exons spanning some 50 kilobases of DNA. We have identified 10 alternatively spliced exons within the extracellular domain, including 1 exon that has not been previously reported. In addition to the inclusion or exclusion of whole exons, more diversity is generated through the utilization of internal splice donor and acceptor sites within 2 of the individual exons. The variation previously reported for the cytoplasmic domain is shown to result from the alternative splicing of 2 exons. The genomic structure of CD44 reveals a remarkable degree of complexity, and we confirm the role of alternative splicing as the basis of the structural and functional diversity seen in the CD44 molecule.
Project description:Mutations in the WNK1 gene, encoding a serine-threonine kinase of the WNK (With No lysine (K)) family, have been implicated in two rare human diseases, Familial Hyperkalemic Hypertension (FHHt) and Hereditary Sensory and Autonomic Neuropathy type 2 (HSAN2). Alternative promoters give rise to a ubiquitous isoform, L-WNK1, and a kidney-specific isoform, KS-WNK1. Several other isoforms are generated through alternative splicing of exons 9, 11 and 12 but their precise tissue distribution is not known. Two additional exons, 8b and HSN2, involved in HSAN2, are thought to be specifically expressed in the nervous system. The purpose of this study was to establish an exhaustive description of all WNK1 isoforms and to quantify their relative level of expression in a panel of human and mouse tissues and in mouse nephron segments. For the latter purpose, we developed a new methodology allowing the determination of the proportions of the different isoforms generated by alternative splicing. Our results evidenced a striking tissue-specific distribution of the different isoforms and the unexpected presence of exon HSN2 in many tissues other than the nervous system. We also found exon 26 to be alternatively spliced in human and identified two new exons, 26a and 26b, within intron 26, specifically expressed in nervous tissues both in humans and mice. WNK1 should therefore no longer be designated as a 28- but as a 32-exon gene, with 8 of them - 8b, HSN2, 9, 11, 12, 26, 26a and 26b - alternatively spliced in a tissue-specific manner. These tissue-specific isoforms must be considered when studying the different roles of this ubiquitous kinase.