Athila4 of Arabidopsis and Calypso of soybean define a lineage of endogenous plant retroviruses.
ABSTRACT: The Athila retroelements of Arabidopsis thaliana encode a putative envelope gene, suggesting that they are infectious retroviruses. Because most insertions are highly degenerate, we undertook a comprehensive analysis of the A. thaliana genome sequence to discern their conserved features. One family (Athila4) was identified whose members are largely intact and share >94% nucleotide identity. As a basis for comparison, related elements (the Calypso elements) were characterized from soybean. Consensus Calypso and Athila4 elements are 12-14 kb in length and have long terminal repeats of 1.3-1.8 kb. Gag and Pol are encoded on a single open reading frame (ORF) of 1801 (Calypso) and 1911 (Athila4) amino acids. Following the Gag-Pol ORF are noncoding regions of ~0.7 and 2 kb, which, respectively, flank the env-like gene. The env-like ORF begins with a putative splice acceptor site and encodes a protein with a predicted central transmembrane domain, similar to retroviral env genes. RNA of Athila elements was detected in an A. thaliana strain with decreased DNA methylation (ddm1). Additionally, a PCR survey identified related reverse transcriptases in diverse angiosperm genomes. Their ubiquitous nature and the potential for horizontal transfer by infection implicate these endogenous retroviruses as important vehicles for plant genome evolution.
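The ">94% nucleotide identity" criterion can be illustrated with a small calculation. The following is a minimal sketch, assuming two sequences that have already been aligned to equal length with "-" marking gaps; the function name `percent_identity` and the choice to skip gapped columns are assumptions made for illustration, not the method used in the study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: inputs are pre-aligned, equal-length strings with '-' as the gap
    character. Gapped columns are ignored (one of several common conventions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy example: 9 of 10 ungapped positions match.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Other conventions (for example, counting gaps as mismatches) would give lower values for the same alignment, so the threshold is only meaningful once the convention is fixed.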
Project description:Retrotransposons and retroviruses share similar intracellular life cycles and major encoded proteins, but retrotransposons lack the envelope (env) critical for infectivity. Retrotransposons are ubiquitous and abundant in plants and active retroviruses are known in animals. Although a few env-containing retroelements, gypsy-like Athila, Cyclops, and Calypso and copia-like SIRE-1, have been identified in plants, the general presence and functionality of the domain remains unclear. We show here that env-class elements are present throughout the flowering plants and are widely transcribed. Within the grasses, we show the transcription of the env domain itself for Bagy-2 and related retrotransposons, all members of the Athila group. Furthermore, Bagy-2 transcripts undergo splicing to generate a subgenomic env product as do those of retroviruses. Transcription and the polymorphism of their insertion sites in closely related barley cultivars suggest that at least some are propagationally active. The putative ENV polypeptides of Bagy-2 and rice Rigy-2 contain predicted leucine zipper and transmembrane domains typical of retroviral ENVs. These findings raise the prospect of active retroviral agents among the plants.
Project description:Tat1 was originally identified as an insertion near the Arabidopsis thaliana SAM1 gene. We provide evidence that Tat1 is a retrotransposon and that previously described insertions are solo long terminal repeats (LTRs) left behind after the deletion of coding regions of full-length elements. Three Tat1 insertions were characterized that have retrotransposon features, including a primer binding site complementary to an A. thaliana asparagine tRNA and an open reading frame (ORF) with approximately 44% amino acid sequence similarity to the gag protein of the Zea mays retrotransposon Zeon-1. Tat1 elements have large, polymorphic 3' noncoding regions that may contain transduced DNA sequences; a 477-base insertion in the 3' noncoding region of the Tat1-3 element contains part of a related retrotransposon and sequences similar to the nontranslated leader sequence of AT-P5C1, a gene for pyrroline-5-carboxylate reductase. Analysis of DNA sequences generated by the A. thaliana genome project identified 10 families of Ty3/gypsy retrotransposons, which share up to 51 and 62% amino-acid similarity to the ORFs of Tat1 and the A. thaliana Athila element, respectively. Phylogenetic analyses resolved the plant Ty3/gypsy elements into two lineages, one of which includes homologs of Tat1 and Athila. Four families of A. thaliana elements within the Tat/Athila lineage encode a conserved ORF after integrase at a position occupied by the envelope gene in retroviruses and in some insect Ty3/gypsy retrotransposons. Like retroviral envelope genes, this ORF encodes a transmembrane domain and, in some insertions, a putative secretory signal sequence. This suggests that Tat/Athila retrotransposons may produce enveloped virions and may be infectious.
Project description:The multiple sclerosis-associated retrovirus (MSRV) isolated from plasma of MS patients was found to be phylogenetically and experimentally related to human endogenous retroviruses (HERVs). To characterize the MSRV-related HERV family and to test the hypothesis of a replication-competent HERV, we have investigated the expression of MSRV-related sequences in healthy tissues. The expression of MSRV-related transcripts restricted to the placenta led to the isolation of overlapping cDNA clones from a cDNA library. These cDNAs spanned a 7.6-kb region containing gag, pol, and env genes; RU5 and U3R flanking sequences; a polypurine tract; and a primer binding site (PBS). As this PBS showed similarity to avian retrovirus PBSs used by tRNATrp, this new HERV family was named HERV-W. Several genomic elements were identified, one of them containing a complete HERV-W unit, spanning all cDNA clones. Elements of this multicopy family were not replication competent, as gag and pol open reading frames (ORFs) were interrupted by frameshifts and stop codons. A complete ORF putatively coding for an envelope protein was found both on the HERV-W DNA prototype and within an RU5-env-U3R polyadenylated cDNA clone. Placental expression of 8-, 3.1-, and 1.3-kb transcripts was observed, and a putative splicing strategy was described. The apparently tissue-restricted HERV-W long terminal repeat expression is discussed with respect to physiological and pathological contexts.
Project description:BACKGROUND: The chromosomes of higher plants are littered with retrotransposons that, in many cases, constitute as much as 80% of plant genomes. Long terminal repeat retrotransposons have been especially successful colonizers of the chromosomes of higher plants and examinations of their function, evolution, and dispersal are essential to understanding the evolution of eukaryotic genomes. In soybean, several families of retrotransposons have been identified, including at least two that, by virtue of the presence of an envelope-like gene, may constitute endogenous retroviruses. However, most elements are highly degenerate and are often sequestered in regions of the genome that sequencing projects initially shun. In addition, finding potentially functional copies from genomic DNA is rare. This study provides a mechanism to surmount these issues to generate a consensus sequence that can then be functionally and phylogenetically evaluated. RESULTS: Diaspora is a multicopy member of the Ty3-gypsy-like family of LTR retrotransposons and comprises at least 0.5% of the soybean genome. Although the Diaspora family is highly degenerate and, with the exception of this report, is not represented in the GenBank nr database, a full-length consensus sequence was generated from short overlapping sequences using a combination of experimental and in silico methods. Diaspora is 11,737 bp in length and contains a single 1892-codon ORF that encodes a gag-pol polyprotein. Phylogenetic analysis indicates that it is closely related to Athila and Calypso retroelements from Arabidopsis and soybean, respectively. These in turn form the framework of an endogenous retrovirus lineage whose members possess an envelope-like gene. Diaspora appears to lack any trace of this coding region.
CONCLUSION: A combination of empirical sequencing and retrieval of unannotated Genome Survey Sequence database entries was successfully used to construct a full-length representative of the Diaspora family in Glycine max. Diaspora is presently the only fully characterized member of a lineage of putative plant endogenous retroviruses that contains virtually no trace of an extra coding region. The loss of an envelope-like coding domain suggests that non-infectious retrotransposons could swiftly evolve from infectious retroviruses, possibly by anomalous splicing of genomic RNA.
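A statement such as "a single 1892-codon ORF" rests on scanning a sequence for open reading frames. The following is a minimal sketch of that idea, assuming an ATG-initiated ORF terminated by a standard stop codon, scanning only the three forward frames, and counting codons without the stop; the function name `longest_orf_codons` and these conventions are illustrative assumptions, not the procedure used in the study.

```python
# Standard stop codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq: str) -> int:
    """Length in codons (stop excluded) of the longest ATG..stop ORF.

    Assumptions: forward strand only, three reading frames, ORFs that run off
    the end of the sequence without a stop codon are ignored.
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # index of the first ATG of the ORF currently open
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None  # close the ORF and look for the next ATG
    return best


if __name__ == "__main__":
    # Toy sequence: ATG AAA TTT TAG in the third forward frame.
    print(longest_orf_codons("CCATGAAATTTTAGCC"))
```

A fuller scan would also cover the reverse complement and could allow for frameshifts, which matter for degenerate elements like those described above.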
Project description:BACKGROUND: The genome of invertebrates is rich in retroelements which are structurally reminiscent of the retroviruses of vertebrates. Those containing three open reading frames (ORFs), including an env-like gene, may well be considered as endogenous retroviruses. Further support to this similarity has been provided by the ability of the env-like gene of DmeGypV (the Gypsy endogenous retrovirus of Drosophila melanogaster) to promote infection of Drosophila cells by a pseudotyped vertebrate retrovirus vector. RESULTS: To gain insights into their evolutionary story, a sample of thirteen insect endogenous retroviruses, which represents the largest sample analysed until now, was studied by computer-assisted comparison of the translated products of their gag, pol and env genes, as well as their LTR structural features. We found that the three phylogenetic trees based respectively on Gag, Pol and Env common motifs are congruent, which suggests a monophyletic origin for these elements. CONCLUSIONS: We showed that most of the insect endogenous retroviruses belong to a major clade group which can be further divided into two main subgroups which also differ by the sequence of their primer binding sites (PBS). We propose to name IERV-K and IERV-S these two major subgroups of Insect Endogenous Retro Viruses (or Insect ERrantiVirus, according to the ICTV nomenclature) which respectively use Lys and Ser tRNAs to prime reverse transcription.
Project description:Recently, we identified and classified 926 human endogenous retrovirus H (HERV-H)-like proviruses in the human genome. In this paper, we used the information to, in silico, reconstruct a putative ancestral HERV-H. A calculated consensus sequence was nearly open in all genes. A few manual adjustments resulted in a putative 9-kb HERV-H provirus with open reading frames (ORFs) in gag, pro, pol, and env. Long terminal repeats (LTRs) differed by 1.1%, indicating proximity to an integration event. The gag ORF was extended upstream of the normal myristylation start site. There was a long leader (including a "pre-gag" ORF) region positioned like the N terminus of murine leukemia virus (MLV) "glyco-Gag," potentially encoding a proline- and serine-rich domain remotely similar to MLV pp12. Another ORF, starting inside the 5' LTR, had no obvious similarity to known protein domains. Unlike other hitherto described gammaretroviruses, the reconstructed Gag had two zinc finger motifs. Alternative splicing of sequences related to the HERV-H consensus was confirmed using dbEST data. env transcripts were most prevalent in colon tumors, but also in normal testis. We found no evidence for full length env transcripts in the dbEST. HERV-H had a markedly skewed nucleotide composition, disfavoring guanine and favoring cytidine. We conclude that the HERV-H consensus shared a gene arrangement common to gammaretroviruses with gag separated by stop codon from pro-pol in the same reading frame, while env resides in another reading frame. There was also alternative splicing. The HERV-H consensus yielded new insights into gammaretroviral evolution and will be useful as a model in studies on expression and function.
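Two computations mentioned above, a calculated consensus sequence and the percent difference between the two LTRs of a provirus, can be sketched briefly. This is a minimal illustration, assuming pre-aligned, equal-length sequences with "-" as the gap character and a simple per-column majority rule; the function names `majority_consensus` and `ltr_divergence` are hypothetical, and the study's actual consensus procedure (which included manual adjustments) is not reproduced here.

```python
from collections import Counter


def majority_consensus(aligned_seqs, gap="-"):
    """Per-column majority-rule consensus of pre-aligned sequences.

    Columns whose most common symbol is the gap are dropped. Ties are resolved
    by first occurrence in the column (Counter ordering), an arbitrary choice.
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    consensus = []
    for column in zip(*aligned_seqs):
        counts = Counter(base.upper() for base in column)
        base, _ = counts.most_common(1)[0]
        if base != gap:
            consensus.append(base)
    return "".join(consensus)


def ltr_divergence(ltr5, ltr3, gap="-"):
    """Percent of ungapped aligned positions at which the two LTRs differ."""
    if len(ltr5) != len(ltr3):
        raise ValueError("aligned LTRs must have equal length")
    pairs = [(a, b) for a, b in zip(ltr5.upper(), ltr3.upper())
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    diffs = sum(1 for a, b in pairs if a != b)
    return 100.0 * diffs / len(pairs)


if __name__ == "__main__":
    print(majority_consensus(["ACGT-A", "ACGTTA", "ACCT-A"]))
    print(ltr_divergence("ACGTACGTAC", "ACGTACGTAA"))
```

Because the two LTRs of a provirus are identical at the time of integration, their divergence grows with time since insertion; converting a percentage into an age would require a substitution rate, which is not given here.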
Project description:Selective pressure to maintain small genome size implies control of transposable elements, and most old classes of retrotransposons are indeed absent from the very compact genome of the tunicate Oikopleura dioica. Nonetheless, two families of retrotransposons are present, including the Tor elements. The gene organization within Tor elements is similar to that of LTR retrotransposons and retroviruses. In addition to gag and pol, many Tor elements carry a third gene encoding viral envelope-like proteins (Env) that may mediate infection. We show that the Tor family contains distinct classes of elements. In some classes, env mRNA is transcribed from the 5'LTR as in retroviruses. In others, env is transcribed from an additional promoter located downstream of the 5'LTR. Tor Env proteins are membrane-associated glycoproteins which exhibit some features of viral membrane fusion proteins. Whereas some elements are expressed in the adult testis, many others are specifically expressed in embryonic somatic cells adjacent to primordial germ cells. Such embryonic expression depends on determinants present in the Tor elements and not on their surrounding genomic environment. Our study shows that unusual modes of transcription and expression close to the germline may contribute to the proliferation of Tor elements.
Project description:The complete genome of the snakehead fish retrovirus has been cloned and sequenced, and its transcriptional profile in cell culture has been determined. The 11.2-kb provirus displays a complex expression pattern capable of encoding accessory proteins and is unique in the predicted location of the env initiation codon and signal peptide upstream of gag and the common splice donor site. The virus is distinguishable from all known retrovirus groups by the presence of an arginine tRNA primer binding site. The coding regions are highly divergent and show a number of unusual characteristics, including a large Gag coiled-coil region, a Pol domain of unknown function, and a long, lentiviral-like, Env cytoplasmic domain. Phylogenetic analysis of the Pol sequence emphasizes the divergent nature of the virus from the avian and mammalian retroviruses. The snakehead virus is also distinct from a previously characterized complex fish retrovirus, suggesting that discrete groups of these viruses have yet to be identified in the lower vertebrates.
Project description:A complete endogenous type C viral genome has been isolated from a baboon genomic library. The provirus, Papio cynocephalus endogenous retrovirus (PcEV), is 8,572 nucleotides long, and 38 to 59 proviral copies per baboon genome are found. The PcEV provirus possesses the typical simple retroviral gene organization, including two long terminal repeats and genes encoding gag, pol, and env proteins. The open reading frames for gag-pol and env are complete but have premature stop codons or frameshift mutations. The primer binding site of PcEV is complementary to tRNAGly. The gag and pol genes of PcEV are closely related to those of the baboon endogenous virus (BaEV). The env coding region of PcEV is related to the env genes of type C retroviruses. This suggests that PcEV is one of the ancestors of BaEV contributing the type C gag-pol genome fragment to the type C/D recombinant virus BaEV. Earlier it was shown that another endogenous type D virus (simian endogenous retrovirus) provided the env gene for BaEV (A. C. van der Kuyl et al., J. Virol. 71:3666-3676, 1997).
Project description:We have determined the nucleotide sequence of a 7.5 kb full-size gypsy element from Drosophila subobscura strain H-271. Comparative analyses were carried out on the sequence and molecular structure of gypsy elements of D. subobscura (gypsyDs), D. melanogaster (gypsyDm) and D. virilis (gypsyDv). The three elements show a structure that maintains a common mechanism of expression. ORF1 and ORF2 show typical motifs of gag and pol genes respectively in the three gypsy elements and could encode functional proteins necessary for intracellular expansion. In the three ORF1 proteins an arginine-rich region was found which could constitute an RNA-binding motif. The main differences among the gypsy elements are found in ORF3 (env-like gene); gypsyDm encodes functional env proteins, whereas gypsyDs and gypsyDv ORF3s lack some motifs essential for functionality of this protein. On the basis of these results, while gypsyDm is the first insect retrovirus described, gypsyDs and gypsyDv could constitute degenerate forms of these retroviruses. In this context, we have found some evidence that gypsyDm could have recently infected some D. subobscura strains. Comparative analyses of divergence and phylogenetic relationships of gypsy elements indicate that the gypsy elements belonging to species of different subgenera (gypsyDs and gypsyDv) are closer than gypsy elements of species belonging to the same subgenus (gypsyDs and gypsyDm). These data are congruent with horizontal transfer of gypsy elements among different Drosophila spp.