BackgroundA very early step in splice site recognition is exon definition, a process that is as yet poorly understood. Communication between the two ends of an exon is thought to be required for this step. We report genome-wide evidence for exons being defined through the combinatorial activity of motifs located in flanking intronic regions.
ResultsStrongly co-occurring motifs were found to specifically reside in four intronic regions surrounding a large number of human exons. These paired motifs occur around constitutive and alternative exons but not pseudo exons. Most co-occurring motifs are limited to intronic regions within 100 nucleotides of the exon. They are preferentially associated with weaker exons. Their pairing is conserved in evolution and they exhibit a lower frequency of single nucleotide polymorphism when paired. Paired motifs display specificity with respect to distance from the exon borders and in constitutive versus alternative splicing. Many resemble binding sites for heterogeneous nuclear ribonucleoproteins. Specific pairs are associated with tissue-specific genes, the higher expression of which coincides with that of the pertinent RNA binding proteins. Tested pairs acted synergistically to enhance exon inclusion, and this enhancement was found to be exon-specific.
ConclusionsThe exon-flanking sequence pairs identified here by genomic analysis promote exon inclusion and may play a role in the exon definition step in pre-mRNA splicing. We propose a model in which multiple concerted interactions are required between exonic sequences and flanking intronic sequences to effect exon definition.
SUBMITTER: Ke S