Expression Changes Confirm Genomic Variants Predicted to Result in Allele-Specific, Alternative mRNA Splicing.
ABSTRACT: Splice isoform structure and abundance can be affected by either noncoding or masquerading coding variants that alter the structure or abundance of transcripts. When these variants are common in the population, these nonconstitutive transcripts are sufficiently frequent so as to resemble naturally occurring, alternative mRNA splicing. Prediction of the effects of such variants has been shown to be accurate using information theory-based methods. Single nucleotide polymorphisms (SNPs) predicted to significantly alter natural and/or cryptic splice site strength were shown to affect gene expression. Splicing changes for known SNP genotypes were confirmed in HapMap lymphoblastoid cell lines with gene expression microarrays and custom designed q-RT-PCR or TaqMan assays. The majority of these SNPs (15 of 22) as well as an independent set of 24 variants were then subjected to RNAseq analysis using the ValidSpliceMut web beacon (http://validsplicemut.cytognomix.com), which is based on data from the Cancer Genome Atlas and International Cancer Genome Consortium. SNPs from different genes analyzed with gene expression microarray and q-RT-PCR exhibited significant changes in affected splice site use. Thirteen SNPs directly affected exon inclusion and 10 altered cryptic site use. Homozygous SNP genotypes resulting in stronger splice sites exhibited higher levels of processed mRNA than alleles associated with weaker sites. Four SNPs exhibited variable expression among individuals with the same genotypes, masking statistically significant expression differences between alleles. Genome-wide information theory and expression analyses (RNAseq) in tumor exomes and genomes confirmed splicing effects for 7 of the HapMap SNPs and 14 SNPs identified from tumor genomes. q-RT-PCR resolved rare splice isoforms with read abundance too low for statistical significance in ValidSpliceMut. Nevertheless, the web beacon provides evidence of unanticipated splicing outcomes, for example, intron retention due to compromised recognition of constitutive splice sites. Thus, ValidSpliceMut and q-RT-PCR represent complementary resources for identification of allele-specific, alternative splicing.
Project description:We present a major public resource of mRNA splicing mutations validated according to multiple lines of evidence of abnormal gene expression. Likely mutations present in all tumor types reported in the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) were identified based on the comparative strengths of splice sites in tumor versus normal genomes, and then validated by comparing counts of splice junction-spanning reads and the abundance of transcript reads, respectively, in RNA-Seq data from matched tissues and tumors lacking these mutations. The comprehensive resource features 341,486 of these validated mutations, the majority of which (69.9%) are not present in the Single Nucleotide Polymorphism Database (dbSNP 150). There are 131,347 unique mutations which weaken or abolish natural splice sites, and 222,071 mutations which strengthen cryptic splice sites (11,932 affect both simultaneously). A total of 28,812 novel or rare flagged variants (with <1% population frequency in dbSNP) were observed in multiple tumor tissue types. An algorithm was developed to classify variants into splicing molecular phenotypes; it integrates germline heterozygosity, degree of information change and impact on expression. The classification thresholds were calibrated against the ClinVar clinical database phenotypic assignments. Variants are partitioned into allele-specific alternative splicing, likely aberrant and aberrant splicing phenotypes. Single variants or chromosome ranges can be queried using a Global Alliance for Genomics and Health (GA4GH)-compliant, web-based Beacon, "Validated Splicing Mutations", either separately or in aggregate alongside other Beacons through the public Beacon Network, as well as through our website. The website provides additional information, such as a visual representation of supporting RNAseq results, gene expression in the corresponding normal tissues, and splicing molecular phenotypes.
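The three mutation counts quoted in this description can be reconciled by inclusion-exclusion: mutations that both weaken a natural site and strengthen a cryptic site appear in both category counts, so they are subtracted once. The short Python check below is only an arithmetic illustration; the variable names are our own and are not taken from the resource.

```python
# Counts as quoted in the project description above.
weaken_natural = 131_347      # mutations that weaken or abolish natural splice sites
strengthen_cryptic = 222_071  # mutations that strengthen cryptic splice sites
both = 11_932                 # mutations that do both simultaneously
reported_total = 341_486      # total validated mutations reported

# Inclusion-exclusion: the overlap is counted in both categories,
# so subtract it once to obtain the number of distinct mutations.
unique_total = weaken_natural + strengthen_cryptic - both

print(unique_total, unique_total == reported_total)
```

The computed value agrees with the reported total, which suggests the two categories and their overlap account for every mutation in the resource.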
Project description:The nonmuscle (nm) myosin light-chain kinase isoform (MLCK), encoded by the MYLK gene, is a vital participant in regulating vascular barrier responses to mechanical and inflammatory stimuli. We determined that MYLK is alternatively spliced, yielding functionally distinct nmMLCK splice variants including nmMLCK2, a splice variant highly expressed in vascular endothelial cells (EC) and associated with reduced EC barrier integrity. We demonstrated previously that the nmMLCK2 variant lacks exon 11, which encodes a key regulatory region containing two differentially phosphorylated tyrosine residues (Y464 and Y471) that influence vascular barrier function during inflammation. In this study, we used minigene constructs and RT-PCR to interrogate biophysical factors (mechanical stress) and genetic variants (MYLK single-nucleotide polymorphisms [SNPs]) that are potentially involved in regulating MYLK alternative splicing and nmMLCK2 generation. Human lung EC exposed to pathologic mechanical stress (18% cyclic stretch) produced increased nmMLCK2 expression relative to levels of nmMLCK1 with alternative splicing significantly influenced by MYLK SNPs rs77323602 and rs147245669. In silico analyses predicted that these variants would alter exon 11 donor and acceptor sites for alternative splicing, computational predictions that were confirmed by minigene studies. The introduction of rs77323602 favored wild-type nmMLCK expression, whereas rs147245669 favored alternative splicing and deletion of exon 11, yielding increased nmMLCK2 expression. Finally, lymphoblastoid cell lines selectively harboring these MYLK SNPs (rs77323602 and rs147245669) directly validated SNP-specific effects on MYLK alternative splicing and nmMLCK2 generation. Together, these studies demonstrate that mechanical stress and MYLK SNPs regulate MYLK alternative splicing and generation of a splice variant, nmMLCK2, that contributes to the severity of inflammatory injury.
Project description:BACKGROUND: PALB2 monoallelic loss-of-function germ-line variants confer a breast cancer risk comparable to the average BRCA2 pathogenic variant. Recommendations for risk reduction strategies in carriers are similar. Elaborating robust criteria to identify loss-of-function variants in PALB2, without incurring overprediction, is thus of paramount clinical relevance. Towards this aim, we have performed a comprehensive characterisation of alternative splicing in PALB2, analysing its relevance for the classification of truncating and splice site variants according to the 2015 American College of Medical Genetics and Genomics-Association for Molecular Pathology guidelines. METHODS: Alternative splicing was characterised in RNAs extracted from blood, breast and fimbriae/ovary-related human specimens (n=112). RNAseq, RT-PCR/CE and CloneSeq experiments were performed by five contributing laboratories. Centralised revision/curation was performed to assure high-quality annotations. Additional splicing analyses were performed in PALB2 c.212-1G>A, c.1684+1G>A, c.2748+2T>G, c.3113+5G>A, c.3350+1G>A, c.3350+4A>C and c.3350+5G>A carriers. The impact of the findings on PVS1 status was evaluated for truncating and splice site variants. RESULTS: We identified 88 naturally occurring alternative splicing events (81 newly described), including 4 in-frame events predicted to be relevant for evaluating the PVS1 status of splice site variants. We did not identify tissue-specific alternate gene transcripts in breast or ovarian-related samples, supporting the clinical relevance of blood-based splicing studies. CONCLUSIONS: PVS1 is not necessarily warranted for splice site variants targeting four PALB2 acceptor sites (exons 2, 5, 7 and 10). As a result, rare variants at these splice sites cannot be assumed to be pathogenic/likely pathogenic without further evidence. Our study raises caution about up to five PALB2 genetic variants that are currently reported as pathogenic/likely pathogenic in ClinVar.
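The variant names listed in the METHODS use HGVS coding-DNA notation, in which `+` offsets lie in the intron after an exon (donor side) and `-` offsets lie in the intron before an exon (acceptor side). Under the ACMG-AMP guidelines, PVS1 is normally considered for the canonical ±1 and ±2 positions. The following Python sketch, which is our own illustration and not part of the study's workflow, parses the seven variants named above and flags which fall at canonical positions. The regular expression and the `classify` helper are hypothetical and handle only simple intronic substitutions near an exon boundary.

```python
import re

# Simple intronic substitution, e.g. "c.3350+4A>C" (illustrative pattern only).
INTRONIC = re.compile(r"^c\.(\d+)([+-])(\d+)([ACGT])>([ACGT])$")

def classify(variant):
    """Return (site, offset, is_canonical) for a simple intronic substitution."""
    match = INTRONIC.match(variant)
    if match is None:
        raise ValueError(f"unsupported variant notation: {variant}")
    _exon_pos, sign, offset, _ref, _alt = match.groups()
    offset = int(offset)
    # '+' counts into the intron downstream of the exon (donor side);
    # '-' counts back from the next exon (acceptor side).
    site = "donor" if sign == "+" else "acceptor"
    # Canonical splice positions are the first two intronic bases on either side.
    return site, offset, offset <= 2

VARIANTS = [
    "c.212-1G>A", "c.1684+1G>A", "c.2748+2T>G", "c.3113+5G>A",
    "c.3350+1G>A", "c.3350+4A>C", "c.3350+5G>A",
]

for name in VARIANTS:
    site, offset, canonical = classify(name)
    label = "canonical" if canonical else "non-canonical"
    print(f"{name}: {site} site, intronic offset {offset}, {label}")
```

A canonical position is only a starting point for classification; as the abstract concludes, naturally occurring in-frame events can mean that PVS1 is not warranted even at some canonical acceptor sites.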
Project description:Alternative splicing of genes is an efficient means of generating variation in protein function. Several disease states have been associated with rare genetic variants that affect splicing patterns. Conversely, splicing efficiency of some genes is known to vary between individuals without apparent ill effects. What is not clear is whether commonly observed phenotypic variation in splicing patterns, and hence potential variation in protein function, is to a significant extent determined by naturally occurring DNA sequence variation and in particular by single nucleotide polymorphisms (SNPs). In this study, we surveyed the splicing patterns of 250 exons in 22 individuals who had been previously genotyped by the International HapMap Project. We identified 70 simple cassette exon alternative splicing events in our experimental system; for six of these, we detected consistent differences in splicing pattern between individuals, with a highly significant association between splice phenotype and neighbouring SNPs. Remarkably, for five out of six of these events, the strongest correlation was found with the SNP closest to the intron-exon boundary, although the distance between these SNPs and the intron-exon boundary ranged from 2 bp to greater than 1,000 bp. Two of these SNPs were further investigated using a minigene splicing system, and in each case the SNPs were found to exert cis-acting effects on exon splicing efficiency in vitro. The functional consequences of these SNPs could not be predicted using bioinformatic algorithms. Our findings suggest that phenotypic variation in splicing patterns is determined by the presence of SNPs within flanking introns or exons. Effects on splicing may represent an important mechanism by which SNPs influence gene function.
Project description:BACKGROUND: The claudin 1 tight junction protein, solely responsible for the barrier function of epithelial cells, is frequently down regulated in invasive human breast cancer. The underlying mechanism is largely unknown, and no obvious mutations in the claudin 1 gene (CLDN1) have been identified to date in breast cancer. Since many genes have been shown to undergo deregulation through splicing and mis-splicing events in cancer, the current study was undertaken to investigate the occurrence of transcript variants for CLDN1 in human invasive breast cancer. METHODS: RT-PCR analysis of CLDN1 transcripts was conducted on RNA isolated from 12 human invasive breast tumors. The PCR products from each tumor were resolved by agarose gel electrophoresis, cloned and sequenced. Genomic DNA was also isolated from each of the 12 tumors and amplified by PCR using CLDN1-specific primers. Sanger sequencing and single nucleotide polymorphism (SNP) analyses were conducted. RESULTS: A number of CLDN1 transcript variants were identified in these breast tumors. All variants were shorter than the classical CLDN1 transcript. Sequence analysis of the PCR products revealed several splice variants, primarily in exon 1 of CLDN1, resulting in truncated proteins. One variant, V1, resulted in a premature stop codon and thus likely led to nonsense-mediated decay. Interestingly, another transcript variant, V2, was not detected in normal breast tissue samples. Further, sequence analysis of the tumor genomic DNA revealed SNPs in 3 of the 4 coding exons, including a rare missense SNP (rs140846629) in exon 2 which represents an Ala124Thr substitution. To our knowledge this is the first report of CLDN1 transcript variants in human invasive breast cancer. These studies suggest that alternative splicing may also be a mechanism by which claudin 1 is down regulated at both the mRNA and protein levels in invasive breast cancer and may provide novel insights into how CLDN1 is reduced or silenced in human breast cancer.
Project description:Alternative pre-mRNA splicing increases proteomic diversity and provides a potential mechanism underlying both phenotypic diversity and susceptibility to genetic disorders in human populations. To investigate the variation in splicing among humans on a genome-wide scale, we use a comprehensive exon-targeted microarray to examine alternative splicing in lymphoblastoid cell lines (LCLs) derived from the CEPH HapMap population. We show the identification of transcripts containing sequence-verified exon skipping, intron retention, and cryptic splice site usage that differ between individuals. A number of novel alternative splicing events with no previous annotations in either the RefSeq or EST databases were identified, indicating that we are able to discover de novo splicing events. Using family-based linkage analysis, we demonstrate Mendelian inheritance and segregation of specific splice isoforms with regulatory haplotypes for three genes: OAS1, CAST, and CRTAP. Allelic association was further used to identify individual SNPs or regulatory haplotype blocks linked to the alternative splicing event, taking advantage of the high-resolution genotype information from the CEPH HapMap population. In one candidate, we identified a regulatory polymorphism that disrupts a 5' splice site of an exon in the CAST gene, resulting in its exclusion in the mutant allele. This report illustrates that our approach can detect both annotated and novel alternatively spliced variants, and that such variation among individuals is heritable and genetically controlled.
Project description:Information theory-based methods have been shown to be sensitive and specific for predicting and quantifying the effects of non-coding mutations in Mendelian diseases. We present the Shannon pipeline software for genome-scale mutation analysis and provide evidence that the software predicts variants affecting mRNA splicing. Individual information contents (in bits) of reference and variant splice sites are compared, and significant differences are annotated and prioritized. The software has been implemented for the CLC-Bio Genomics platform. Annotation indicates the context of novel mutations as well as common and rare SNPs with splicing effects. Potential natural and cryptic mRNA splicing variants are identified, and null mutations are distinguished from leaky mutations. Mutations and rare SNPs were predicted in the genomes of three cancer cell lines (U2OS, U251 and A431), and these predictions were supported by expression analyses. After filtering, the software predicts tractable numbers of potentially deleterious variants, suitable for further laboratory investigation. In these cell lines, novel functional variants comprised 6-17 inactivating mutations, 1-5 leaky mutations and 6-13 cryptic splicing mutations. Predicted effects were validated by RNA-seq analysis of the three aforementioned cancer cell lines, and expression microarray analysis of SNPs in HapMap cell lines.
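To make the comparison of individual information contents concrete, the Python sketch below scores a reference and a variant site against a weight matrix using the standard form R_i = Σ (2 + log2 f(b, l)) for a four-letter alphabet. It is a minimal illustration under stated assumptions and not the Shannon pipeline's implementation: the six aligned donor-like sequences are invented toy data, a pseudocount stands in for a proper small-sample correction, and the function names are hypothetical.

```python
import math

def weight_matrix(sites, pseudocount=1.0):
    """Per-position weights 2 + log2(frequency), built from aligned sites."""
    matrix = []
    for pos in range(len(sites[0])):
        # Pseudocounts avoid log2(0) for bases never seen at this position.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append({b: 2.0 + math.log2(c / total) for b, c in counts.items()})
    return matrix

def individual_information(seq, matrix):
    """Sum the position-specific weights for one sequence (R_i, in bits)."""
    return sum(matrix[pos][base] for pos, base in enumerate(seq))

# Hypothetical aligned donor-like sites: 3 exonic bases followed by 6 intronic bases.
TOY_SITES = [
    "CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA",
    "TAGGTAAGT", "CAGGTGAGG", "GAGGTAAGT",
]

matrix = weight_matrix(TOY_SITES)
reference = "CAGGTAAGT"
variant = "CAGATAAGT"  # G>A at the first intronic base (index 3)

ri_ref = individual_information(reference, matrix)
ri_var = individual_information(variant, matrix)
delta = ri_var - ri_ref

print(f"R_i reference: {ri_ref:.2f} bits")
print(f"R_i variant:   {ri_var:.2f} bits")
print(f"change:        {delta:.2f} bits")
```

A negative change indicates a weakened site. In a real analysis the matrix would be derived from a large set of annotated splice sites rather than a handful of toy sequences, and thresholds would separate leaky from null mutations.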
Project description:BACKGROUND: BRCA2 germ-line mutations predispose to breast and ovarian cancer. Mutations are widespread and unclassified splice variants are frequently encountered. We describe the parental origin and functional characterization of a novel de novo BRCA2 splice site mutation found in a patient exhibiting a ductal carcinoma at the age of 40. METHODS: Variations were identified by denaturing high performance liquid chromatography (dHPLC) and sequencing of the BRCA1 and BRCA2 genes. The effect of the mutation on splicing was examined by exon trapping in COS-7 cells and by RT-PCR on RNA isolated from whole blood. The paternity was determined by single nucleotide polymorphism (SNP) microarray analysis. Parental origin of the de novo mutation was determined by establishing mutation-SNP haplotypes by variant-specific PCR, while de novo and mosaic status was investigated by sequencing of DNA from leucocytes and carcinoma tissue. RESULTS: A novel BRCA2 variant in the splice donor site of exon 21 (nucleotide 8982+1 G-->A/c.8754+1 G-->A) was identified. Exon trapping showed that the mutation activates a cryptic splice site 46 base pairs 3' of exon 21, resulting in the inclusion of a premature stop codon and synthesis of a truncated BRCA2 protein. The aberrant splicing was verified by RT-PCR analysis on RNA isolated from whole blood of the affected patient. The mutation was not found in either of the patient's parents or in the mother's carcinoma, showing that it is a de novo mutation. Variant-specific PCR indicates that the mutation arose in the male germ-line. CONCLUSION: We conclude that the novel BRCA2 splice variant is a de novo mutation that arose in the paternal germ-line and can be classified as a disease-causing mutation.
Project description:This study examined single-nucleotide variants and indels (SNPs/indels) in the coding and non-coding regions of the Homo sapiens BRAF gene. Variant data were obtained from the SNP database (dbSNP), up to its last update in November 2015. Several bioinformatics tools were used to identify SNPs and indels with functional effects on protein function, structure and expression. Among coding polymorphisms, 111 SNPs were predicted to be highly damaging and six others less damaging. In the UTRs, five SNPs and one indel were predicted to alter microRNA binding sites (3' UTR), whereas no SNP or indel was predicted to functionally alter transcription factor binding sites (5' UTR). In addition, analysis of the 5'/3' splice sites showed that one SNP within a 5' splice site and one indel in a 3' splice site could potentially alter splicing. In conclusion, the functional SNPs and indels identified here could alter the gene, which may contribute directly or indirectly to the occurrence of many diseases.
Project description:Recently, thanks to the increasing throughput of new technologies, we have begun to explore the full extent of alternative pre-mRNA splicing (AS) in the human transcriptome. This is unveiling a vast layer of complexity in isoform-level expression differences between individuals. We used previously published splicing-sensitive microarray data from lymphoblastoid cell lines to conduct an in-depth analysis of the splicing efficiency of known and predicted exons. By combining publicly available AS annotation with a novel algorithm designed to search for AS, we show that many real AS events can be detected within the usually unexploited, speculative majority of the array and at significance levels much below standard multiple-testing thresholds, demonstrating that the extent of cis-regulated differential splicing between individuals is potentially far greater than previously reported. Specifically, many genes show subtle but significant genetically controlled differences in splice-site usage. PCR validation shows that 42 out of 58 (72%) candidate gene regions undergo detectable AS, amounting to the largest scale validation of isoform eQTLs to date. Targeted sequencing revealed a likely causative SNP in most validated cases. In all 17 instances where a SNP affected a splice-site region, in silico splice-site strength modeling correctly predicted the direction of the microarray and PCR results. In 13 other cases, we identified likely causative SNPs disrupting predicted splicing enhancers. Using Fst and REHH analysis, we uncovered significant evidence that 2 putative causative SNPs have undergone recent positive selection. We verified the effect of five SNPs using in vivo minigene assays. This study shows that splicing differences between individuals, including quantitative differences in isoform ratios, are frequent in human populations and that causative SNPs can be identified using in silico predictions. Several cases affected disease-relevant genes, and it is likely that some of these differences are involved in phenotypic diversity and susceptibility to complex diseases.