Ylt1, a highly repetitive retrotransposon in the genome of the dimorphic fungus Yarrowia lipolytica.
ABSTRACT: A highly repetitive composite element, Ylt1, was detected in the genome of the dimorphic fungus Yarrowia lipolytica. Ylt1 resembles retrotransposons found in other eukaryotes. It is about 9.4 kb long and can transpose in the genome. The Ylt1 element is bounded by a long terminal repeat (LTR), the zeta element. Several copies of zeta were isolated and sequenced. The sequence of this element is well conserved. It is 714 bp long and is bounded by nucleotides 5'-TG...CA-3', which are part of a short inverted repeat, a feature conserved in the LTRs of retroviruses and retrotransposons. Sequence analysis revealed motifs commonly found in LTR elements, like signals for the start and termination of transcription. The zeta element exists as part of retrotransposon Ylt1, as well as a solo element in the genome. Ylt1 and solo zeta elements are flanked by a 4-bp directly repeated genomic sequence. The copy numbers of Ylt1 and solo zeta are dependent on the strain examined, but at least 35 copies of the composite Ylt1 element and more than 30 copies of the solo zeta element per haploid genome have been observed.
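The structural hallmarks described in this abstract (terminal 5'-TG...CA-3' nucleotides and a 4-bp direct repeat of flanking genomic sequence) can be checked mechanically once an element's boundaries are known. The following is a minimal Python sketch under that assumption; the function name `has_ltr_hallmarks` and the toy sequence are hypothetical illustrations, not part of the study.

```python
def has_ltr_hallmarks(genomic: str, start: int, end: int, tsd_len: int = 4) -> bool:
    """Check the element at genomic[start:end] for LTR-style hallmarks.

    Hypothetical helper: returns True when the element starts with TG,
    ends with CA, and the tsd_len bases immediately before and after it
    are identical (a target-site duplication).
    """
    genomic = genomic.upper()
    if start < tsd_len or end + tsd_len > len(genomic):
        return False  # not enough flanking sequence to compare
    element = genomic[start:end]
    left_flank = genomic[start - tsd_len:start]
    right_flank = genomic[end:end + tsd_len]
    return (
        element.startswith("TG")
        and element.endswith("CA")
        and left_flank == right_flank
    )


# Invented toy sequence: 6 bp of flank, a 15-bp "element", 6 bp of flank.
toy = "AAGTCA" + "TGTTGGAATCCAACA" + "GTCATT"
print(has_ltr_hallmarks(toy, 6, 21))
```

The check is deliberately strict: real insertions may carry mutated duplications, so a practical screen would likely tolerate mismatches.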
Project description:BACKGROUND: LTR retrotransposons are one of the main causes of plant genome size and structure evolution, along with polyploidy. The characterization of their amplification and subsequent elimination from genomes is therefore a major goal in plant evolutionary genomics. To address the extent and timing of these forces, we performed a detailed analysis of 41 LTR retrotransposon families in rice. RESULTS: Using a new method to estimate the insertion date of both truncated and complete copies, we estimated these two forces more accurately than previous studies based on other methods. We show that LTR retrotransposons have undergone bursts of amplification within the past 5 My. These bursts vary both in date and copy number among families, revealing that each family has a particular amplification history. The number of solo LTRs varies among families and seems to correlate with LTR size, suggesting that solo LTR formation is a family-dependent process. The deletion rate estimate leads to the prediction that the half-life of LTR retrotransposon sequences evolving neutrally is about 19 My in rice, suggesting that processes other than the formation of small deletions are prevalent in rice DNA removal. CONCLUSION: Our work provides insights into the dynamics of LTR retrotransposons in the rice genome. We show that transposable element families have distinct amplification patterns, and that the turnover of LTR retrotransposon sequences is rapid in the rice genome.
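For intuition about the 19 My half-life quoted in this abstract, the surviving fraction of sequence can be modeled as simple exponential decay. The sketch below assumes a constant removal rate over time, which is an assumption of this illustration and not a claim of the study; the function name `fraction_remaining` is hypothetical.

```python
def fraction_remaining(age_my: float, half_life_my: float = 19.0) -> float:
    """Fraction of sequence expected to survive after age_my million years.

    Assumes first-order (exponential) decay with the given half-life.
    """
    if age_my < 0 or half_life_my <= 0:
        raise ValueError("age must be >= 0 and half-life must be > 0")
    return 0.5 ** (age_my / half_life_my)


# Example: expected survival after one half-life and after 5 My.
for age in (19.0, 5.0):
    print(age, round(fraction_remaining(age), 3))
```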
Project description:The availability of the sequenced Drosophila melanogaster genome provides an opportunity to study sequence variation between copies within transposable element families. In this study, we analyzed the 624 copies of 22 transposable element (TE) families (14 LTR retrotransposons, five non-LTR retrotransposons, and three transposons). LTR and non-LTR retrotransposons possessed far fewer divergent elements than the transposons, suggesting that the difference depends on the transposition mechanism. However, there was not a continuous range of divergence of the copies in each class, which were either very similar to the canonical elements, or very divergent from them. This sequence homogeneity among TE family copies matches the theoretical models of the dynamics of these repeated sequences. The sequenced Drosophila genome thus appears to be composed of a mixture of TEs that are still active and of ancient relics that have degenerated and the distribution of which along the chromosomes results from natural selection. This clearly demonstrates that the TEs are highly active within the genome, suggesting that the genetic variability of the Drosophila genome is still being renewed by the action of TEs.
Project description:LTR retrotransposons constitute a significant part of plant genomes and their evolutionary dynamics play an important role in genome size changes. Current methods of LTR retrotransposon age estimation are based only on LTR (long terminal repeat) divergence. This has prompted us to analyze sequence similarity of LTRs in 25,144 LTR retrotransposons from fifteen plant species as well as formation of solo LTRs. We found that approximately one fourth of nested retrotransposons showed a higher LTR divergence than the pre-existing retrotransposons into which they had been inserted. Moreover, LTR similarity was correlated with LTR length. We propose that gene conversion can contribute to this phenomenon. Gene conversion prediction in LTRs showed potential converted regions in 25% of LTR pairs. Gene conversion was higher in species with smaller genomes while the proportion of solo LTRs did not change with genome size in analyzed species. The negative correlation between the extent of gene conversion and the abundance of solo LTRs suggests interference between gene conversion and ectopic recombination. Since such phenomena limit the traditional methods of LTR retrotransposon age estimation, we recommend an improved approach based on the exclusion of regions affected by gene conversion.
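The traditional dating approach that this abstract refers to relies on the fact that the two LTRs of an element are identical at insertion and then diverge independently, so the age is commonly estimated as T = K / (2r), where K is the divergence between the LTRs and r is the substitution rate. The Python sketch below illustrates that calculation with a Jukes-Cantor correction. It assumes the two LTRs are already aligned; the function name is hypothetical, the rate in the example is an illustrative placeholder, and the sketch does not implement the exclusion of gene-converted regions that the authors recommend.

```python
import math


def ltr_insertion_age(ltr5: str, ltr3: str, rate_per_site_per_year: float) -> float:
    """Estimate insertion age in years from two pre-aligned LTRs of one element.

    Uses T = K / (2r) with a Jukes-Cantor corrected distance K.
    Alignment columns containing a gap ('-') in either LTR are ignored.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTRs must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(ltr5.upper(), ltr3.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)  # observed mismatch fraction
    if p == 0:
        return 0.0
    if p >= 0.75:
        raise ValueError("divergence too high for the Jukes-Cantor correction")
    k = -0.75 * math.log(1 - 4 * p / 3)  # corrected substitutions per site
    return k / (2 * rate_per_site_per_year)


# Toy example: 100 aligned sites with 2 mismatches; the rate is a placeholder.
five_prime = "A" * 100
three_prime = "A" * 98 + "GG"
print(round(ltr_insertion_age(five_prime, three_prime, 1.3e-8)))
```

Because gene conversion can make LTRs look more similar than their true age implies, estimates from a sketch like this should be read as lower bounds where conversion is suspected.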
Project description:The complete DNA sequence of the genome of Schizosaccharomyces pombe provides the opportunity to investigate the entire complement of transposable elements (TEs), their association with specific sequences, their chromosomal distribution, and their evolution. Using homology-based sequence identification, we found that the sequenced strain of S. pombe contained only one family of full-length transposons. This family, Tf2, consisted of 13 full-length copies of a long terminal repeat (LTR) retrotransposon. We found that LTR-LTR recombination of previously existing transposons had resulted in extensive populations of solo LTRs. These included 35 solo LTRs of Tf2, as well as 139 solo LTRs from other Tf families. Phylogenetic analysis of solo Tf LTRs reveals that Tf1 and Tf2 were the most recently active elements within the genome. The solo LTRs also served as footprints for previous insertion events by the Tf retrotransposons. Analysis of 186 genomic insertion events revealed a close association with RNA polymerase II promoters. These insertions clustered in the promoter-proximal regions of genes, upstream of protein coding regions by 100 to 400 nucleotides. The association of Tf insertions with pol II promoters was very similar to the preference previously observed for Tf1 integration. We found that the recently active Tf elements were absent from centromeres and pericentromeric regions of the genome containing tandem tRNA gene clusters. In addition, our analysis revealed that chromosome III has twice the density of insertion events compared to the other two chromosomes. Finally we describe a novel repetitive sequence, wtf, which was also preferentially located on chromosome III, and was often located near solo LTRs of Tf elements.
Project description:It is becoming apparent that perhaps as much as half of the genome of the human blood fluke Schistosoma mansoni is constituted of mobile genetic element-related sequences. Non-long terminal repeat (LTR) retrotransposons, related to the LINE elements of mammals, comprise much of this repetitive component of the schistosome genome. Of more than 12 recognized clades of non-LTR retrotransposons, only members of the CR1, RTE, and R2 clades have been reported from the schistosome genome. Inspection of the nucleotide sequence of bacterial artificial chromosome number 49_J_14 from chromosome 1 of the genome of Schistosoma mansoni (GenBank AC093105) revealed the likely presence of several RTE-like retrotransposons. Among these, a new non-LTR retrotransposon designated SR3 was identified and is characterized here. Analysis of gene structure and phylogenetic analysis of both the reverse transcriptase and endonuclease domains of the mobile element indicated that SR3 represented a new family of RTE-like non-LTR retrotransposons. Remarkably, two full-length copies of SR3-like elements were present in BAC 49-J-14, and one of 3,211 bp in length appeared to be intact, indicating SR3 to be an active non-LTR retrotransposon. Both were flanked by target site duplications of 10-12 bp. Southern hybridization and bioinformatics analyses indicated the presence of numerous copies (probably >1,000) of SR3 interspersed throughout the genome of S. mansoni. Bioinformatics analyses also revealed SR3 to be transcribed in both larval and adult developmental stages of S. mansoni and to be also present in the genomes of the other major schistosome parasites of humans, Schistosoma haematobium and S. japonicum. Numerous copies of SR3, a novel non-LTR retrotransposon of the RTE clade, are present in the genome of S. mansoni. Non-LTR retrotransposons of the RTE clade, including SR3, appear to have been remarkably successful in colonizing, and proliferating within, the schistosome genome.
Project description:Retroviruses and LTR retrotransposons comprise two long terminal repeats (LTRs) bounding a central domain that encodes the products needed for reverse transcription, packaging, and integration into the genome. We describe a group of retrotransposons in 13 species and four genera of the grass tribe Triticeae, including barley, with long, approximately 4.4-kb LTRs formerly called Sukkula elements. The approximately 3.5-kb central domains include reverse transcriptase priming sites and are conserved in sequence but contain no open reading frames encoding typical retrotransposon proteins. However, they specify well-conserved RNA secondary structures. These features describe a novel group of elements, called large retrotransposon derivatives (LARDs). These appear to be members of the gypsy class of LTR retrotransposons. Although apparently nonautonomous, LARDs appear to be transcribed and can be recombinationally mapped due to the polymorphism of their insertion sites. They are dispersed throughout the genome in an estimated 1.3 × 10^3 full-length copies and 1.16 × 10^4 solo LTRs, indicating frequent recombinational loss of internal domains as demonstrated also for the BARE-1 barley retrotransposon.
Project description:BACKGROUND: LTR retrotransposons transpose through reverse transcription of an RNA intermediate and are ubiquitous components of all eukaryotic genomes thus far examined. Plant genomes, in particular, have been found to contain a remarkably high number of LTR retrotransposons. There is a significant body of direct and indirect evidence that LTR retrotransposons have contributed to gene and genome evolution in plants. RESULTS: To explore the evolutionary history of long terminal repeat (LTR) retrotransposons and their impact on the genome of Oryza sativa, we have extended an earlier computer-based survey to include all identifiable full-length, fragmented and solo LTR elements in the rice genome database as of April 2002. A total of 1,219 retroelement sequences were identified, including 217 full-length elements, 822 fragmented elements, and 180 solo LTRs. In order to gain insight into the chromosomal distribution of LTR-retrotransposons in the rice genome, a detailed examination of LTR-retrotransposon sequences on Chromosome 10 was carried out. An average of 22.3 LTR-retrotransposons per Mb were detected in Chromosome 10. CONCLUSIONS: Gypsy-like elements were found to be >4 x more abundant than copia-like elements. Eleven of the thirty-eight investigated LTR-retrotransposon families displayed significant subfamily structure. We estimate that at least 46.5% of LTR-retrotransposons in the rice genome are older than the age of the species (< 680,000 years). LTR-retrotransposons present in the rice genome range in age from those just recently inserted up to nearly 10 million years old. Approximately 20% of LTR retrotransposon sequences lie within putative genes. The distribution of elements across chromosome 10 is non-random with the highest density (48 elements per Mb) being present in the pericentric region.
Project description:Background: Endogenous viral elements (EVEs) are sequences of viral origin integrated into the host genome. EVEs have been characterized in various insect genomes, including mosquitoes. A large EVE content has been found in Aedes aegypti and Aedes albopictus genomes among which a recently described Chuviridae viral family is of particular interest, owing to the abundance of EVEs derived from it, the discrepancy among the chuvirus endogenized gene regions and the frequent association with retrotransposons from the BEL-Pao superfamily. In order to better understand the endogenization process of chuviruses and the association between chuvirus glycoproteins and BEL-Pao retrotransposons, we performed a comparative genomics and evolutionary analysis of chuvirus-derived EVEs found in 37 mosquito genomes. Results: We identified 428 EVEs belonging to the Chuviridae family, confirming the wide discrepancy among the chuvirus genomic regions endogenized: 409 glycoproteins, 18 RNA-dependent RNA polymerases and one nucleoprotein region. Most of the glycoproteins (263 out of 409) are associated specifically with retroelements from the Pao family. Focusing only on well-assembled Pao retroelement copies, we estimated that 263 out of 379 Pao elements are associated with chuvirus-derived glycoproteins. Seventy-three potentially active Pao copies were found to contain glycoproteins within their LTR boundaries. Thirteen of these were classified as complete and likely autonomous copies, with a full LTR structure and protein domains. We also found 116 Pao copies with no trace of glycoproteins and 37 solo glycoproteins. All potentially autonomous Pao copies contained highly similar LTRs, suggesting recent or current activity of these elements in the mosquito genomes.
Conclusion: Evolutionary analysis revealed that most of the glycoproteins found are likely derived from a single or few glycoprotein endogenization events associated with a recombination event with a Pao ancestral element. A potentially functional Pao-chuvirus hybrid (named Anakin) emerged and the glycoprotein was further replicated through retrotransposition. However, a number of solo glycoproteins, not associated with Pao elements, can be found in some mosquito genomes, suggesting that these glycoproteins were likely domesticated by the host genome and may participate in an antiviral defense mechanism against both chuvirus and Anakin retrovirus.
Project description:We identified putative long terminal repeat (LTR) retrotransposon sequences among the 50,000 random sequence tags (RSTs) obtained by the Génolevures project from genomic libraries of 13 Hemiascomycetes species. In most cases additional sequencing enabled us to assemble the whole sequences of these retrotransposons. These approaches identified 17 distinct families, 10 of which are defined by full-length elements. We also identified five families of solo LTRs that were not associated with retrotransposons. Ty1-like retrotransposons were found in four of five species that are phylogenetically related to Saccharomyces cerevisiae (S. uvarum, S. exiguus, S. servazzii, and S. kluyveri but not Zygosaccharomyces rouxii), and in two of three Kluyveromyces species (K. lactis and K. marxianus but not K. thermotolerans). Only multiply crippled elements could be identified in the K. lactis and S. servazzii strains analyzed, and only solo LTRs could be identified in S. uvarum. Ty4-like elements were only detected in S. uvarum, indicating that these elements appeared recently before speciation of the Saccharomyces sensu stricto species. Ty5-like elements were detected in S. exiguus, Pichia angusta, and Debaryomyces hansenii. A retrotransposon homologous with Tca2 from Candida albicans, an element absent from S. cerevisiae, was detected in the closely related species D. hansenii. A complete Ty3/gypsy element was present in S. exiguus, whereas only partial, often degenerate, sequences resembling this element were found in S. servazzii, Z. rouxii, S. kluyveri, C. tropicalis, and Yarrowia lipolytica. P. farinosa (syn. P. sorbitophila) is currently the only yeast species in which no LTR retrotransposons or remnants have been found.
Thorough analysis of protein sequences, structural characteristics of the elements, and phylogenetic relationships deduced from these data allowed us to propose a classification for the Ty1/copia elements of hemiascomycetous yeasts and a model of LTR-retrotransposon evolution in yeasts.
Project description:BACKGROUND: Retrotransposons make a significant contribution to the size, organization and genetic diversity of their host genomes. To characterize retrotransposon families in the grapevine genome (the fourth crop plant genome sequenced) we have combined two approaches: a PCR-based method for the isolation of RNaseH-LTR sequences with a computer-based sequence similarity search in the whole-genome sequence of PN40024. RESULTS: Supported by a phylogenetic analysis, ten novel Ty1/copia families were distinguished in this study. To select a canonical reference element sequence from amongst the various insertions in the genome belonging to each retroelement family, the following screening criteria were adopted to identify the element sequence with: (1) perfect 5-bp duplication of target sites, (2) the highest level of identity between 5' and 3' LTRs within a single insertion sequence, and (3) the longest uninterrupted coding capacity within the gag-pol ORF. One to eight copies encoding a single putatively functional gag-pol polyprotein were found for three families, indicating that these families could be still autonomous and active. For the others, no autonomous copies were identified. However, a subset of copies within the presumably non-autonomous families had perfect identity between their 5' and 3' LTRs, indicating a recent insertion event. A phylogenetic study based on the sequence alignment of the region located between reverse transcriptase domains I and VII distinguished these 10 families from other plant retrotransposons. Including the previously characterized Ty1/copia-like grapevine retrotransposons Tvv1 and Vine 1 and the Ty3/gypsy-like Gret1 in this assessment, a total of 1709 copies were identified for the 13 retrotransposon families, representing 1.24% of the sequenced genome. The copy number per family ranged from 91 to 212 copies.
We performed insertion site profiling for 8 out of the 13 retrotransposon families and confirmed multiple insertions of these elements across the Vitis genus. Insertional polymorphism analysis and dating of full-length copies based on their LTR divergence demonstrated that each family has a particular amplification history, with 71% of the identified copies being inserted within the last 2 million years. CONCLUSION: The strategy we used efficiently delivered new Ty1/copia-like retrotransposon sequences, increasing the total number of characterized grapevine retrotransposons from 3 to 13. We provide insights into the representation and dynamics of the 13 families in the genome. Our data demonstrated that each family has a particular amplification pattern, with 7 families having copies recently inserted within the last 0.2 million years. Among those 7 families with recent insertions, three retain the capacity for activity in the grape genome today.