Novel Role of 3'UTR-Embedded Alu Elements as Facilitators of Processed Pseudogene Genesis and Host Gene Capture by Viral Genomes.
ABSTRACT: Since the discovery of the high abundance of Alu elements in the human genome, interest in the functional significance of these retrotransposons has been increasing. Primate Alu and rodent Alu-like elements are retrotransposed by a mechanism driven by the LINE1 (L1)-encoded proteins, the same machinery that generates the L1 repeats, the processed pseudogenes (PPs), and other retroelements. Apart from free Alu RNAs, Alus are also transcribed and reverse transcribed as part of cellular gene transcripts, generally embedded inside 3' untranslated regions (UTRs). Despite several proposed hypotheses, the functional implication of the presence of Alus inside 3'UTRs remains elusive. In this study we hypothesized that Alu elements in 3'UTRs could be involved in the genesis of PPs. By analyzing human genome data we discovered that 3'UTR-embedded Alu elements are overrepresented in genes that are sources of PPs. In contrast, other retrotransposable elements in 3'UTRs do not show this PP-linked overrepresentation. This research was extended to the mouse and rat genomes, and the results likewise reveal overrepresentation of 3'UTR-embedded B1 (Alu-like) elements in PP parent genes. Interestingly, we also demonstrated that the overrepresentation of 3'UTR-embedded Alus is particularly significant in PP parent genes with low germline expression levels. Finally, we provide data that support the hypothesis that the L1 machinery is also the system that herpesviruses, and possibly other large DNA viruses, use to capture host genes expressed in germline or somatic cells. Altogether, our results suggest a novel role for Alu or Alu-like elements inside 3'UTRs as facilitators of the genesis of PPs, particularly in lowly expressed genes. Moreover, we propose that this L1-driven mechanism, aided by the presence of 3'UTR-embedded Alus, may also be exploited by DNA viruses to incorporate host genes into their viral genomes.
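One common way to quantify an overrepresentation of this kind is a one-sided hypergeometric test on gene counts. The sketch below is illustrative only: the abstract does not say which statistical test the authors used, and the function name, parameter names, and the counts in the usage note are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, genes_with_alu, pp_parents, pp_parents_with_alu):
    """One-sided hypergeometric p-value (illustrative; not the authors' stated method).

    Returns P(X >= pp_parents_with_alu), where X is the number of genes with a
    3'UTR-embedded Alu among `pp_parents` genes drawn without replacement from
    `total_genes` genes, of which `genes_with_alu` carry such an Alu.
    """
    max_overlap = min(genes_with_alu, pp_parents)
    denominator = comb(total_genes, pp_parents)
    # Sum the exact probabilities of every overlap at least as large as observed.
    numerator = sum(
        comb(genes_with_alu, k) * comb(total_genes - genes_with_alu, pp_parents - k)
        for k in range(pp_parents_with_alu, max_overlap + 1)
    )
    return numerator / denominator
```

With made-up counts, `hypergeom_enrichment_p(10, 5, 4, 4)` asks how likely it is that all 4 PP parent genes carry a 3'UTR Alu when 5 of 10 genes do. A small p-value would indicate overrepresentation under this model.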
Project description:BACKGROUND: Abundant pseudogenes are a feature of mammalian genomes. Processed pseudogenes (PPs) are reverse transcribed from mRNAs. Recent molecular biological studies show that mammalian long interspersed element 1 (L1)-encoded proteins may have been involved in PP reverse transcription. Here, we present the first comprehensive analysis of human PPs using all known human genes as queries. RESULTS: The human genome was queried and 3,664 candidate PPs were identified. The most abundant were copies of genes encoding keratin 18, glyceraldehyde-3-phosphate dehydrogenase and ribosomal protein L21. A simple method was developed to estimate the level of nucleotide substitutions (and therefore the age) of PPs. A Poisson-like age distribution was obtained with a mean age close to that of the Alu repeats, the predominant human short interspersed elements. These data suggest a nearly simultaneous burst of PP and Alu formation in the genomes of ancestral primates. The peak period of amplification of these two distinct retrotransposons was estimated to be 40-50 million years ago. Concordant amplification of certain L1 subfamilies with PPs and Alus was observed. CONCLUSIONS: We suggest that a burst of formation of PPs and Alus occurred in the genome of ancestral primates. One possible mechanism is that proteins encoded by members of particular L1 subfamilies acquired an enhanced ability to recognize cytosolic RNAs in trans.
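The step from nucleotide substitution level to age can be illustrated with a standard molecular-clock calculation. The sketch below assumes a Jukes-Cantor correction and a caller-supplied neutral substitution rate. The abstract does not describe the authors' "simple method", so this is a generic stand-in, and the function names and the rate parameter are hypothetical.

```python
import math


def jukes_cantor_distance(p):
    """Substitutions per site from the observed fraction of differing sites p.

    Uses the Jukes-Cantor correction K = -(3/4) * ln(1 - 4p/3), which is only
    defined for 0 <= p < 0.75.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("observed divergence must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def estimate_age_years(p, subs_per_site_per_year):
    """Approximate age as corrected divergence divided by an assumed rate.

    The rate is an assumption supplied by the caller; it is not taken from
    the abstract.
    """
    if subs_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    return jukes_cantor_distance(p) / subs_per_site_per_year
```

A call such as `estimate_age_years(0.10, rate)` converts a 10% observed divergence into an age for whatever neutral rate is assumed. The result scales inversely with that rate, so the choice of rate dominates the estimate.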
Project description:Despite numerous studies implicating Alu repeat elements in various diseases, there is sparse information available with respect to the potential functional and biological roles of the repeat elements in Type 1 diabetes (T1D). Therefore, we performed a genome-wide sequence analysis of T1D candidate genes to identify embedded Alu elements within these genes. We observed significant enrichment of Alu elements within the T1D genes (p-value < 10e-16), which highlights their importance in T1D. Functional annotation of T1D genes harboring Alus revealed significant enrichment for immune-mediated processes (p-value < 10e-6). We also identified eight T1D genes harboring inverted Alus (IRAlus) within their 3' untranslated regions (UTRs) that are known to regulate the expression of host mRNAs by generating double stranded RNA duplexes. Our in silico analysis predicted the formation of duplex structures by IRAlus within the 3'UTRs of T1D genes. We propose that IRAlus might be involved in regulating the expression levels of the host T1D genes.
Project description:Long INterspersed Elements (LINE-1s, L1s) are responsible for over one million retrotransposon insertions and 8000 processed pseudogenes (PPs) in the human genome. An active L1 encodes two proteins (ORF1p and ORF2p) that bind with L1 RNA and form L1-ribonucleoprotein particles (RNPs). Although it is believed that the RNA-binding property of ORF1p is critical to recruit other mobile RNAs to the RNP, the identity of recruited RNAs is largely unknown. Here, we used crosslinking and immunoprecipitation followed by deep sequencing to identify RNA components of L1-RNPs. Our results show that in addition to retrotransposed RNAs [L1, Alu and SINE-VNTR-Alu (SVA)], L1-RNPs are enriched with cellular mRNAs, which have PPs in the human genome. Using purified L1-RNPs, we show that PP-source RNAs preferentially serve as ORF2p templates in a reverse transcriptase assay. In addition, we find that exogenous ORF2p binds endogenous ORF1p, allowing reverse transcription of the same PP-source RNAs. These data demonstrate that interaction of a cellular RNA with the L1-RNP is an inside track to PP formation.
Project description:Non-coding RNAs from transposable elements of the human genome are gaining prominence in modulating transcriptome dynamics. Alu elements, as exonized, edited and antisense components within the same transcripts, could create novel regulatory switches in response to different transcriptional cues. We provide the first evidence for co-occurrences of these events at transcriptome-wide scale through integrative analysis of data sets across diverse experimental platforms and tissues. This involved the following: (i) positional anchoring of Alu exonization events in the UTRs and CDS of 4663 transcript isoforms from RefSeq mRNAs and (ii) mapping on to them A→I editing events inferred from ~7 million ESTs from dbEST and antisense transcripts identified from virtual serial analysis of gene expression tags represented in Cancer Genome Anatomy Project next-generation sequencing data sets across 20 tissues. We observed significant enrichment of these events in the 3'UTR as well as positional preference within the embedded Alus. More than 300 genes had co-occurrence of all these events at the exon level and were significantly enriched in apoptosis and lysosomal processes. Further, we demonstrate functional evidence of such dynamic interactions between Alu-mediated events in time series data from Integrated Personal Omics Profiling during recovery from a viral infection. Such a 'single transcript-multiple fate' opportunity facilitated by Alu elements may modulate transcriptional response, especially during stress.
Project description:The Alu elements are conserved approximately 300-nucleotide-long repeat sequences that belong to the SINE family of retrotransposons found abundantly in primate genomes. Pairs of inverted Alu repeats in RNA can form duplex structures that lead to hyperediting by the ADAR enzymes, and at least 333 human genes contain such repeats in their 3'-UTRs. Here, we show that a pair of inverted Alus placed within the 3'-UTR of egfp reporter mRNA strongly represses EGFP expression, whereas a single Alu has little or no effect. Importantly, the observed silencing correlates with A-to-I RNA editing, nuclear retention of the mRNA and its association with the protein p54(nrb). Further, we show that inverted Alu elements can act in a similar fashion in their natural chromosomal context to silence the adjoining gene. For example, the Nicolin 1 gene expresses multiple mRNA isoforms differing in the 3'-UTR. One isoform that contains the inverted repeat is retained in the nucleus, whereas another lacking these sequences is exported to the cytoplasm. Taken together, these results support a novel role for Alu elements in human gene regulation.
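Candidate inverted Alu pairs in a 3'-UTR can be flagged from repeat annotations by looking for two Alus on opposite strands. The sketch below is a minimal illustration, not the procedure used in the study: the `Repeat` record, the name-prefix test for Alus, and the optional `max_gap` cutoff are all assumptions.

```python
from itertools import combinations
from typing import NamedTuple


class Repeat(NamedTuple):
    """Hypothetical annotation record for one repeat inside a 3'-UTR."""
    name: str    # repeat name, e.g. "AluSx"
    start: int   # start coordinate within the UTR
    end: int     # end coordinate within the UTR
    strand: str  # "+" or "-"


def inverted_alu_pairs(repeats, max_gap=None):
    """Return pairs of Alu annotations that lie on opposite strands.

    If max_gap is given, pairs separated by more than max_gap bases are
    skipped. The cutoff is an illustrative assumption, not a value from
    the abstract.
    """
    alus = sorted(
        (r for r in repeats if r.name.startswith("Alu")),
        key=lambda r: r.start,
    )
    pairs = []
    for upstream, downstream in combinations(alus, 2):
        if upstream.strand == downstream.strand:
            continue  # same orientation: cannot pair as an inverted repeat
        gap = downstream.start - upstream.end
        if max_gap is not None and gap > max_gap:
            continue
        pairs.append((upstream, downstream))
    return pairs
```

Passing a list of `Repeat` records for one UTR returns the opposite-strand Alu pairs. Whether a given pair actually forms an RNA duplex would still need structure prediction or experimental evidence.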
Project description:BACKGROUND: RNA editing by adenosine to inosine deamination is a widespread phenomenon, particularly frequent in the human transcriptome, largely due to the presence of inverted Alu repeats and their ability to form double-stranded structures, a requisite for ADAR editing. While several hundred thousand editing sites have been identified within these primate-specific repeats, the function of Alu-editing has yet to be elucidated. RESULTS: We show that inverted Alu repeats, expressed in the primate brain, can induce site-selective editing in cis on sites located several hundred nucleotides from the Alu elements. Furthermore, a computational analysis, based on available RNA-seq data, finds that site-selective editing occurs significantly closer to edited Alu elements than expected. These targets are poorly edited upon deletion of the editing inducers, as well as in homologous transcripts from organisms lacking Alus. Sequences surrounding sites near edited Alus in UTRs have been subjected to a lesser extent of evolutionary selection than those far from edited Alus, indicating that their editing generally depends on cis-acting Alus. Interestingly, we find an enrichment of primate-specific editing within encoded sequence or the UTRs of zinc finger-containing transcription factors. CONCLUSIONS: We propose a model whereby primate-specific editing is induced by adjacent Alu elements that function as recruitment elements for the ADAR editing enzymes. The enrichment of site-selective editing with potentially functional consequences on the expression of transcription factors indicates that editing contributes more profoundly to the transcriptomic regulation and repertoire in primates than previously thought.
Project description:With over one million copies, Alu elements are the most abundant repetitive elements in the human genome. When transcribed, interaction between two Alus that are in opposite orientation gives rise to double-stranded RNA (dsRNA). Although the presence of dsRNA in the cell was previously thought to occur only during viral infection, it is now known that cells express many endogenous small dsRNAs, such as short interfering RNAs (siRNAs) and microRNAs (miRNAs), which regulate gene expression. It is possible that long dsRNA structures formed from Alu elements influence gene expression. Here, we report that human mRNAs containing inverted Alu elements are present in the mammalian cytoplasm. The presence of these long intramolecular dsRNA structures within 3'-UTRs decreases translational efficiency, and although the structures undergo extensive editing in vivo, the effects on translation are independent of the presence of inosine. As inverted Alus are predicted to reside in >5% of human protein-coding genes, these intramolecular dsRNA structures are important regulators of gene expression.
Project description:Transposable elements (TEs) account for nearly one-half of the sequence content in the human genome, and de novo germline transposition into regulatory or coding sequences of protein-coding genes can cause heritable disorders. TEs are prevalent in and around protein-coding genes, providing an opportunity to impart regulation. Computational studies reveal that microRNA (miRNA) genes and miRNA target sites reside within TE sequences, but there is little experimental evidence supporting a role for TEs in the birth of miRNAs, or as a platform for gene regulation by miRNAs. In this work, we validate miRNAs and target sites derived from TE families prevalent in the human genome, including the ancient long interspersed nuclear element 2 (LINE2/L2), mammalian-wide interspersed repeat (MIR) retrotransposons and the primate-specific Alu family. We show that genes with 3' untranslated region (3' UTR) MIR elements are enriched for let-7 targets and that these sites are conserved and responsive to let-7 expression. We also demonstrate that 3' UTR-embedded Alus are a source of miR-24 and miR-122 target sites and that a subset of active genomic Alus provide for de novo target site creation. Finally, we report that although the creation of miRNA genes by Alu elements is uncommon relative to their overall genomic abundance, Alu-derived miR-1285-1 is efficiently processed from its genomic locus and regulates genes with target sites contained within homologous elements. Taken together, our data provide additional evidence for TEs as a source for miRNAs and miRNA target sites, with instances of conservation through the course of mammalian evolution.
Project description:BACKGROUND: Polymorphic Alu elements account for 17% of structural variants in the human genome. The majority of these belong to the youngest AluY subfamilies, and most structural variant discovery efforts have focused on identifying Alu polymorphisms from these currently retrotranspositionally active subfamilies. In this report we analyze polymorphisms from the evolutionarily older AluS subfamily, whose peak activity was tens of millions of years ago. We annotate the AluS polymorphisms, assess their likely mechanism of origin, and evaluate their contribution to structural variation in the human genome. RESULTS: Of 52 previously reported polymorphic AluS elements ascertained for this study, 48 were confirmed to belong to the AluS subfamily using high stringency subfamily classification criteria. Of these, the majority (77%, 37/48) appear to be deletion polymorphisms. Two polymorphic AluS elements (4%) have features of non-classical Alu insertions and one polymorphic AluS element (2%) likely inserted by a mechanism involving internal priming. Seven AluS polymorphisms (15%) appear to have arisen by the classical target-primed reverse transcription (TPRT) retrotransposition mechanism. These seven TPRT products are 3' intact with 3' poly-A tails, and are flanked by target site duplications; L1 ORF2p endonuclease cleavage sites were also observed, providing additional evidence that these are L1 ORF2p endonuclease-mediated TPRT insertions. Further sequence analysis showed strong conservation of both the RNA polymerase III promoter and SRP9/14 binding sites, important for mediating transcription and interaction with retrotransposition machinery, respectively. This conservation of functional features implies that some of these are fairly recent insertions since they have not diverged significantly from their respective retrotranspositionally competent source elements. CONCLUSIONS: Of the polymorphic AluS elements evaluated in this report, 15% (7/48) have features consistent with TPRT-mediated insertion, thus suggesting that some AluS elements have been more active recently than previously thought, or that fixation of AluS insertion alleles remains incomplete. These data expand the potential significance of polymorphic AluS elements in contributing to structural variation in the human genome. Future discovery efforts focusing on polymorphic AluS elements are likely to identify more such polymorphisms, and approaches tailored to identify deletion alleles may be warranted.
Project description:Alu elements are trans-mobilized by the autonomous non-LTR retroelement, LINE-1 (L1). Alu-induced insertion mutagenesis contributes to about 0.1% of human genetic disease and is responsible for the majority of the documented instances of human retroelement insertion-induced disease. Here we introduce a SINE recovery method that provides a complementary approach for comprehensive analysis of the impact and biological mechanisms of Alu retrotransposition. Using this approach, we recovered 226 de novo tagged Alu inserts in HeLa cells. Our analysis reveals that in human cells marked Alu inserts driven by either exogenously supplied full length L1 or ORF2 protein are indistinguishable. Four percent of de novo Alu inserts were associated with genomic deletions and rearrangements and lacked the hallmarks of retrotransposition. In contrast to L1 inserts, 5' truncations of Alu inserts are rare, as most of the recovered inserts (96.5%) are full length. De novo Alus show a random pattern of insertion across chromosomes, but further characterization revealed that an Alu insertion bias exists favoring insertion near other SINEs, highly conserved elements, with almost 60% landing within genes. De novo Alu inserts show no evidence of RNA editing. Priming for reverse transcription rarely occurred within the first 20 bp (most 5') of the A-tail. The A-tails of recovered inserts show significant expansion, with many at least doubling in length. Sequence manipulation of the construct led to the demonstration that the A-tail expansion likely occurs during insertion due to slippage by the L1 ORF2 protein. We postulate that the A-tail expansion directly impacts Alu evolution by reintroducing new active source elements to counteract the natural loss of active Alus and minimizing Alu extinction.