What Can Long Terminal Repeats Tell Us About the Age of LTR Retrotransposons, Gene Conversion and Ectopic Recombination?
ABSTRACT: LTR retrotransposons constitute a significant part of plant genomes and their evolutionary dynamics play an important role in genome size changes. Current methods of LTR retrotransposon age estimation are based only on LTR (long terminal repeat) divergence. This has prompted us to analyze sequence similarity of LTRs in 25,144 LTR retrotransposons from fifteen plant species as well as formation of solo LTRs. We found that approximately one fourth of nested retrotransposons showed a higher LTR divergence than the pre-existing retrotransposons into which they had been inserted. Moreover, LTR similarity was correlated with LTR length. We propose that gene conversion can contribute to this phenomenon. Gene conversion prediction in LTRs showed potential converted regions in 25% of LTR pairs. Gene conversion was higher in species with smaller genomes while the proportion of solo LTRs did not change with genome size in analyzed species. The negative correlation between the extent of gene conversion and the abundance of solo LTRs suggests interference between gene conversion and ectopic recombination. Since such phenomena limit the traditional methods of LTR retrotransposon age estimation, we recommend an improved approach based on the exclusion of regions affected by gene conversion.
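The divergence-based dating that the abstract refers to can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the two LTRs of one element are already aligned, uses a Jukes-Cantor correction (the abstract does not name a substitution model), and converts the distance K to an age with T = K / (2r). The function names `ltr_divergence` and `insertion_age`, the `excluded` parameter, and the default rate of 1.3e-8 substitutions per site per year are all assumptions made for this example; an appropriate rate depends on the species. The `excluded` intervals stand in for regions predicted to be affected by gene conversion, which the abstract recommends leaving out.

```python
import math


def ltr_divergence(ltr5, ltr3, excluded=()):
    """Jukes-Cantor distance between two aligned LTR sequences.

    `excluded` is a hypothetical list of half-open (start, end) alignment
    column ranges to skip, e.g. predicted gene conversion tracts.
    Columns with gaps or ambiguous bases are skipped as well.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTRs must be aligned to the same length")

    skip = set()
    for start, end in excluded:
        skip.update(range(start, end))

    compared = 0
    mismatches = 0
    for i, (a, b) in enumerate(zip(ltr5.upper(), ltr3.upper())):
        if i in skip or a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no comparable sites left after filtering")

    p = mismatches / compared  # uncorrected proportion of differing sites
    if p >= 0.75:
        raise ValueError("divergence too high for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age(k, rate=1.3e-8):
    """Estimated years since insertion, T = K / (2r).

    `rate` is an assumed substitution rate per site per year.
    """
    return k / (2.0 * rate)


if __name__ == "__main__":
    five_prime = "ACGT" * 25
    three_prime = "C" + five_prime[1:50] + "T" + five_prime[51:]
    k_all = ltr_divergence(five_prime, three_prime)
    k_masked = ltr_divergence(five_prime, three_prime, excluded=[(0, 10)])
    print(insertion_age(k_all), insertion_age(k_masked))
```

Masking a region changes both the number of compared sites and the number of mismatches, so the estimated age can move in either direction depending on where the substitutions fall.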
Project description:BACKGROUND: LTR retrotransposons transpose through reverse transcription of an RNA intermediate and are ubiquitous components of all eukaryotic genomes thus far examined. Plant genomes, in particular, have been found to be composed of a remarkably high number of LTR retrotransposons. There is a significant body of direct and indirect evidence that LTR retrotransposons have contributed to gene and genome evolution in plants. RESULTS: To explore the evolutionary history of long terminal repeat (LTR) retrotransposons and their impact on the genome of Oryza sativa, we have extended an earlier computer-based survey to include all identifiable full-length, fragmented and solo LTR elements in the rice genome database as of April 2002. A total of 1,219 retroelement sequences were identified, including 217 full-length elements, 822 fragmented elements, and 180 solo LTRs. In order to gain insight into the chromosomal distribution of LTR retrotransposons in the rice genome, a detailed examination of LTR-retrotransposon sequences on chromosome 10 was carried out. An average of 22.3 LTR retrotransposons per Mb were detected in chromosome 10. CONCLUSIONS: Gypsy-like elements were found to be more than 4 times as abundant as copia-like elements. Eleven of the thirty-eight investigated LTR-retrotransposon families displayed significant subfamily structure. We estimate that at least 46.5% of LTR retrotransposons in the rice genome are older than the age of the species (< 680,000 years). LTR retrotransposons present in the rice genome range in age from those just recently inserted up to nearly 10 million years old. Approximately 20% of LTR retrotransposon sequences lie within putative genes. The distribution of elements across chromosome 10 is non-random, with the highest density (48 elements per Mb) being present in the pericentric region.
Project description:We report the first described non-plant family of TRIMs (terminal-repeat retrotransposons in miniature), which are small nonautonomous LTR retrotransposons, from the whole-genome sequence of the red harvester ant, Pogonomyrmex barbatus (Hymenoptera: Myrmicinae). Members of this retrotransposon family, named PbTRIM, have typical features of plant TRIMs in length and structure, although they share no overall sequence similarity. PbTRIM elements and their solo-LTRs are abundant in the host genome and exhibit an uneven distribution pattern. Elements are preferentially inserted into TA-rich regions with ATAT as the most common pattern of target site duplication (TSD). PbTRIM is most likely mobile as indicated by the young age of many complete elements, the high degree of sequence similarity among elements at different genomic locations, the abundance of elements in the host genome, and the presence of 4-bp target site duplications (TSDs) flanking the elements and solo-LTRs. Many PbTRIM elements and their solo-LTRs are located within or near genes, suggesting their potential roles in restructuring the host genes and genome. Database search, PCR and sequencing analysis revealed the presence of homologous PbTRIM elements in other ant species. The high sequence similarity between elements from distantly related ant species, the incongruence between the phylogenies of PbTRIM and its hosts, and the patchy distribution of the retroelement within the Myrmicinae subfamily indicate possible horizontal transfer events of the retroelement.
Project description:Retroviruses and LTR retrotransposons comprise two long terminal repeats (LTRs) bounding a central domain that encodes the products needed for reverse transcription, packaging, and integration into the genome. We describe a group of retrotransposons in 13 species and four genera of the grass tribe Triticeae, including barley, with long, approximately 4.4-kb LTRs formerly called Sukkula elements. The approximately 3.5-kb central domains include reverse transcriptase priming sites and are conserved in sequence but contain no open reading frames encoding typical retrotransposon proteins. However, they specify well-conserved RNA secondary structures. These features describe a novel group of elements, called large retrotransposon derivatives (LARDs). These appear to be members of the gypsy class of LTR retrotransposons. Although apparently nonautonomous, LARDs appear to be transcribed and can be recombinationally mapped due to the polymorphism of their insertion sites. They are dispersed throughout the genome in an estimated 1.3 × 10^3 full-length copies and 1.16 × 10^4 solo LTRs, indicating frequent recombinational loss of internal domains as demonstrated also for the BARE-1 barley retrotransposon.
Project description:BACKGROUND AND AIMS: Peanut (Arachis hypogaea) is an allotetraploid (AABB-type genome) of recent origin, with a genome of about 2·8 Gb and a high repetitive content. This study reports an analysis of the repetitive component of the peanut A genome using bacterial artificial chromosome (BAC) clones from A. duranensis, the most probable A genome donor, and the probable consequences of the activity of these elements since the divergence of the peanut A and B genomes. METHODS: The repetitive content of the A genome was analysed by using A. duranensis BAC clones as probes for fluorescence in situ hybridization (BAC-FISH), and by sequencing and characterization of 12 genomic regions. For the analysis of the evolutionary dynamics, two A genome regions are compared with their B genome homeologues. KEY RESULTS: BAC-FISH using 27 A. duranensis BAC clones as probes gave dispersed and repetitive DNA characteristic signals, predominantly in interstitial regions of the peanut A chromosomes. The sequences of 14 BAC clones showed complete and truncated copies of ten abundant long terminal repeat (LTR) retrotransposons, characterized here. Almost all dateable transposition events occurred <3·5 million years ago, the estimated date of the divergence of A and B genomes. The most abundant retrotransposon is Feral, apparently parasitic on the retrotransposon FIDEL, followed by Pipa, also non-autonomous and probably parasitic on a retrotransposon we named Pipoka. The comparison of the A and B genome homeologous regions showed conserved segments of high sequence identity, punctuated by predominantly indel regions without significant similarity. CONCLUSIONS: A substantial proportion of the highly repetitive component of the peanut A genome appears to be accounted for by relatively few LTR retrotransposons and their truncated copies or solo LTRs. The most abundant of the retrotransposons are non-autonomous. The activity of these retrotransposons has been a very significant driver of genome evolution since the evolutionary divergence of the A and B genomes.
Project description:The complete DNA sequence of the genome of Schizosaccharomyces pombe provides the opportunity to investigate the entire complement of transposable elements (TEs), their association with specific sequences, their chromosomal distribution, and their evolution. Using homology-based sequence identification, we found that the sequenced strain of S. pombe contained only one family of full-length transposons. This family, Tf2, consisted of 13 full-length copies of a long terminal repeat (LTR) retrotransposon. We found that LTR-LTR recombination of previously existing transposons had resulted in extensive populations of solo LTRs. These included 35 solo LTRs of Tf2, as well as 139 solo LTRs from other Tf families. Phylogenetic analysis of solo Tf LTRs reveals that Tf1 and Tf2 were the most recently active elements within the genome. The solo LTRs also served as footprints for previous insertion events by the Tf retrotransposons. Analysis of 186 genomic insertion events revealed a close association with RNA polymerase II promoters. These insertions clustered in the promoter-proximal regions of genes, upstream of protein coding regions by 100 to 400 nucleotides. The association of Tf insertions with pol II promoters was very similar to the preference previously observed for Tf1 integration. We found that the recently active Tf elements were absent from centromeres and pericentromeric regions of the genome containing tandem tRNA gene clusters. In addition, our analysis revealed that chromosome III has twice the density of insertion events compared to the other two chromosomes. Finally we describe a novel repetitive sequence, wtf, which was also preferentially located on chromosome III, and was often located near solo LTRs of Tf elements.
Project description:We initially analyzed 11 families of low- and middle-copy-number long terminal repeat (LTR) retrotransposons in rice to determine how their structures have diverged from their predicted ancestral forms. These elements, many highly fragmented, were identified on the basis of sequence homology and structural characteristics. The 11 families, totaling 1000 elements, have copy numbers ranging from 1 to 278. Less than one-quarter of these elements are intact, whereas the remainder are solo LTRs and variously truncated fragments. We also analyzed two highly repetitive families (Osr8 and Osr30) of LTR retrotransposons and observed the same results. Our data indicate that unequal homologous recombination and illegitimate recombination are primarily responsible for LTR-retrotransposon removal. Further analysis suggests that most of the detectable LTR retrotransposons in rice inserted less than 8 million years ago, and have now lost over two-thirds of their encoded sequences. Hence, we predict that the half-life of LTR-retrotransposon sequences in rice is less than 6 million years. Moreover, our data demonstrate that at least 22% (97 Mb) of the current rice genome is composed of LTR-retrotransposon sequences, and that more than 190 Mb of LTR-retrotransposon sequences have been deleted from the rice genome in the last 8 million years.
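The half-life statement above can be checked with simple arithmetic. The sketch below assumes first-order (exponential) loss of sequence, which the abstract does not state explicitly, so it is an illustration rather than the authors' calculation. The function name `half_life` is hypothetical. With 97 Mb retained and 190 Mb deleted over 8 million years, the retained fraction is 97 / (97 + 190), roughly one third, and the implied half-life is t · ln 2 / ln(1 / fraction).

```python
import math


def half_life(remaining_fraction, elapsed_my):
    """Half-life in million years under an assumed exponential decay model.

    remaining_fraction: share of the original sequence still present (0 < f < 1)
    elapsed_my: time over which the loss occurred, in million years
    """
    if not 0.0 < remaining_fraction < 1.0:
        raise ValueError("remaining_fraction must be strictly between 0 and 1")
    return elapsed_my * math.log(2.0) / -math.log(remaining_fraction)


if __name__ == "__main__":
    retained_mb = 97.0   # LTR-retrotransposon sequence present today (from the abstract)
    deleted_mb = 190.0   # sequence deleted over the last 8 My (from the abstract)
    fraction = retained_mb / (retained_mb + deleted_mb)
    print(f"retained fraction: {fraction:.3f}")
    print(f"implied half-life: {half_life(fraction, 8.0):.2f} My")
```

Under this assumption the result is a little above 5 million years, which is consistent with the stated bound of less than 6 million years. Because the abstract gives 190 Mb and 8 million years as limits rather than exact values, the figure should be read as approximate.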
Project description:We identified putative long terminal repeat (LTR) retrotransposon sequences among the 50,000 random sequence tags (RSTs) obtained by the Génolevures project from genomic libraries of 13 Hemiascomycetes species. In most cases additional sequencing enabled us to assemble the whole sequences of these retrotransposons. These approaches identified 17 distinct families, 10 of which are defined by full-length elements. We also identified five families of solo LTRs that were not associated with retrotransposons. Ty1-like retrotransposons were found in four of five species that are phylogenetically related to Saccharomyces cerevisiae (S. uvarum, S. exiguus, S. servazzii, and S. kluyveri but not Zygosaccharomyces rouxii), and in two of three Kluyveromyces species (K. lactis and K. marxianus but not K. thermotolerans). Only multiply crippled elements could be identified in the K. lactis and S. servazzii strains analyzed, and only solo LTRs could be identified in S. uvarum. Ty4-like elements were only detected in S. uvarum, indicating that these elements appeared recently before speciation of the Saccharomyces sensu stricto species. Ty5-like elements were detected in S. exiguus, Pichia angusta, and Debaryomyces hansenii. A retrotransposon homologous with Tca2 from Candida albicans, an element absent from S. cerevisiae, was detected in the closely related species D. hansenii. A complete Ty3/gypsy element was present in S. exiguus, whereas only partial, often degenerate, sequences resembling this element were found in S. servazzii, Z. rouxii, S. kluyveri, C. tropicalis, and Yarrowia lipolytica. P. farinosa (syn. P. sorbitophila) is currently the only yeast species in which no LTR retrotransposons or remnants have been found. Thorough analysis of protein sequences, structural characteristics of the elements, and phylogenetic relationships deduced from these data allowed us to propose a classification for the Ty1/copia elements of hemiascomycetous yeasts and a model of LTR-retrotransposon evolution in yeasts.
Project description:Long terminal repeat retrotransposons (LTR-RTs) are the major DNA components in flowering plants. Most LTR-RTs contain dinucleotides 'TG' and 'CA' at the ends of the two LTRs. Here we report the structure, evolution, and propensity of a tomato atypical retrotransposon element (TARE1) with both LTRs starting as 'TA'. This family is also characterized by high copy numbers (354 copies), short LTR size (194 bp), extremely low ratio of solo LTRs to intact elements (0.05:1), recent insertion (most within 0.75-1.75 million years, Mys), and enrichment in pericentromeric region. The majority (83%) of the TARE1 elements are shared between S. lycopersicum and its wild relative S. pimpinellifolium, but none of them are found in potato. In the present study, we used shared LTR-RTs as molecular markers and estimated the divergence time between S. lycopersicum and S. pimpinellifolium to be <0.5 Mys. Phylogenetic analysis showed that the TARE1 elements, together with two closely related families, TARE2 and TGRE1, have formed a sub-lineage belonging to a Copia-like Ale lineage. Although TARE1 and TARE2 shared similar structural characteristics, the timing, scale, and activity of their amplification were found to be substantially different. We further propose a model wherein a single mutation from 'G' to 'A' in 3' LTR followed by amplification is responsible for the origin of TARE1, thus providing evidence that the proliferation of a spontaneous mutation can be mediated by the amplification of LTR-RTs at the level of RNA.
Project description:DRD1 is a SWI/SNF-like protein that cooperates with a plant-specific RNA polymerase, Pol IVb, to facilitate RNA-directed de novo methylation and silencing of homologous DNA. Screens to identify endogenous targets of this pathway in Arabidopsis revealed intergenic regions and plant genes located primarily in euchromatin. Many putative targets are near retrotransposon LTRs or other intergenic sequences that encode short RNAs, which might epigenetically regulate adjacent genes. Consistent with this, derepression of a solo LTR in drd1 and pol IVb mutants was accompanied by reduced cytosine methylation and transcriptional upregulation of neighboring sequences. The solo LTR and several other LTRs that flank reactivated targets are associated with euchromatic histone modifications but little or no H3K9 dimethylation, a hallmark of constitutive heterochromatin. By contrast, LTRs of retrotransposons that remain silent in the mutants despite reduced cytosine methylation lack euchromatic marks and have H3K9 dimethylation. We propose that DRD1 and Pol IVb establish a basal level of silencing that can potentially be reversed in euchromatin, and further reinforced in heterochromatin by other proteins that induce more stable modifications.
Project description:A highly repetitive composite element, Ylt1, was detected in the genome of the dimorphic fungus Yarrowia lipolytica. Ylt1 resembles retrotransposons found in other eukaryotes. It is about 9.4 kb long and can transpose in the genome. The Ylt1 element is bounded by a long terminal repeat (LTR), the zeta element. Several copies of zeta were isolated and sequenced. The sequence of this element is well conserved. It is 714 bp long and is bounded by nucleotides 5'-TG...CA-3', which are part of a short inverted repeat, a feature conserved in the LTRs of retroviruses and retrotransposons. Sequence analysis revealed motifs commonly found in LTR elements, like signals for the start and termination of transcription. The zeta element exists as part of retrotransposon Ylt1, as well as a solo element in the genome. Ylt1 and solo zeta elements are flanked by a 4-bp directly repeated genomic sequence. The copy numbers of Ylt1 and solo zeta are dependent on the strain examined, but at least 35 copies of the composite Ylt1 element and more than 30 copies of the solo zeta element per haploid genome have been observed.