A Survey of Transposon Landscapes in the Putative Ancient Asexual Ostracod Darwinula stevensoni.
ABSTRACT: How asexual reproduction shapes transposable element (TE) content and diversity in eukaryotic genomes remains debated. We performed an initial survey of TE load and diversity in the putative ancient asexual ostracod Darwinula stevensoni. We examined long contiguous stretches of DNA in clones from a genomic fosmid library, totaling about 2.5 Mb, and supplemented these data with results on TE abundance and diversity from an Illumina draft genome. In contrast to other TE studies in putatively ancient asexuals, which revealed relatively low TE content, we found that at least 19% of the fosmid dataset and 26% of the genome assembly corresponded to known transposons. We observed a high diversity of transposon families, including LINE, gypsy, PLE, mariner/Tc, hAT, CMC, Sola2, Ginger, Merlin, Harbinger, MITEs and helitrons, with DNA transposons predominating. The predominantly low levels of sequence diversity indicate that many TEs are or have recently been active. In the fosmid data, no correlation was found between telomeric repeats and non-LTR retrotransposons, which are present near telomeres in other taxa. Most TEs in the fosmid data were located outside of introns and almost none were found in exons. We also report an N-terminal Myb/SANT-like DNA-binding domain in site-specific R4/Dong non-LTR retrotransposons. Although initial results on transposable element loads need to be verified with high-quality draft genomes, this study provides important first insights into TE dynamics in putative ancient asexual ostracods.
Project description:The availability of the sequenced Drosophila melanogaster genome provides an opportunity to study sequence variation between copies within transposable element families. In this study, we analyzed the 624 copies of 22 transposable element (TE) families (14 LTR retrotransposons, five non-LTR retrotransposons, and three transposons). LTR and non-LTR retrotransposons possessed far fewer divergent elements than the transposons, suggesting that the difference depends on the transposition mechanism. However, there was not a continuous range of divergence of the copies in each class, which were either very similar to the canonical elements, or very divergent from them. This sequence homogeneity among TE family copies matches the theoretical models of the dynamics of these repeated sequences. The sequenced Drosophila genome thus appears to be composed of a mixture of TEs that are still active and of ancient relics that have degenerated and the distribution of which along the chromosomes results from natural selection. This clearly demonstrates that the TEs are highly active within the genome, suggesting that the genetic variability of the Drosophila genome is still being renewed by the action of TEs.
Project description:We analyzed the distribution of 54 families of transposable elements (TEs; transposons, LTR retrotransposons, and non-LTR retrotransposons) in the chromosomes of Drosophila melanogaster, using data from the sequenced genome. The density of LTR and non-LTR retrotransposons (RNA-based elements) was high in regions with low recombination rates, but there was no clear tendency to parallel the recombination rate. However, the density of transposons (DNA-based elements) was significantly negatively correlated with recombination rate. The accumulation of TEs in regions of reduced recombination rate is compatible with selection acting against TEs, as selection is expected to be weaker in regions with lower recombination. The differences in the relationship between recombination rate and TE density that exist between chromosome arms suggest that TE distribution depends on specific characteristics of the chromosomes (chromatin structure, distribution of other sequences), the TEs themselves (transposition mechanism), and the species (reproductive system, effective population size, etc.), that have differing influences on the effect of natural selection acting against the TE insertions.
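The relationship described above between TE density and recombination rate can be illustrated with a rank correlation. The following is a minimal sketch, assuming a Spearman coefficient; the abstract does not state which statistic the authors used, and the function names and per-window values are hypothetical placeholders, not data from the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-window values (illustrative only, not from the study).
recombination_rate = [0.2, 0.8, 1.5, 2.1, 3.0]   # e.g. cM/Mb
transposon_density = [9.0, 7.5, 4.0, 3.2, 1.1]   # e.g. insertions per Mb

rho = spearman(recombination_rate, transposon_density)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient near -1 corresponds to the pattern reported for DNA-based elements, where density falls as recombination rate rises. Testing whether the correlation is significant would require an additional step, such as a permutation test.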
Project description:The interactions between transposable elements (TEs) and their hosts constitute one of the most profound co-evolutionary processes found in nature. The population dynamics of TEs depends on factors specific to each TE family, such as the rate of transposition and insertional preference, the demographic history of the host and the genomic landscape. How these factors interact has yet to be investigated holistically. Here we address this question in the green anole (Anolis carolinensis), whose genome contains an extraordinary diversity of TEs (including non-LTR retrotransposons, SINEs, LTR-retrotransposons and DNA transposons). We observed a positive correlation between recombination rate and both TE frequency and TE density for LINEs, SINEs and DNA transposons. For these elements, there was a clear impact of demography on TE frequency and abundance, with a loss of polymorphic elements and skewed frequency spectra in recently expanded populations. On the other hand, some LTR-retrotransposons displayed patterns consistent with a very recent phase of intense amplification. To determine how demography, genomic features and intrinsic properties of TEs interact, we ran simulations using SLiM3. We determined that i) short TE insertions are not strongly counter-selected, but long ones are, ii) neutral demographic processes, linked selection and preferential insertion may explain positive correlations between average TE frequency and recombination, iii) TE insertions are unlikely to have been massively recruited in recent adaptation. We demonstrate that deterministic and stochastic processes have different effects on categories of TEs and that a combination of empirical analyses and simulations can disentangle these mechanisms.
Project description:Background:Unicellular species make up the majority of eukaryotic diversity; however, most studies on transposable elements (TEs) have centred on multicellular host species. Such studies may therefore have provided a limited picture of how transposable elements evolve across eukaryotes. The choanoflagellates, as the sister group to Metazoa, are an important study group for investigating unicellular to multicellular transitions. A previous survey of the choanoflagellate Monosiga brevicollis revealed the presence of only three families of LTR retrotransposons, all of which appeared to be active. Salpingoeca rosetta is the second choanoflagellate to have its whole genome sequenced and provides further insight into the evolution and population biology of transposable elements in the closest relative of metazoans. Results:Screening the genome revealed the presence of a minimum of 20 TE families. Seven of the annotated families are DNA transposons and the remaining 13 families are LTR retrotransposons. Evidence for two putative non-LTR retrotransposons was also uncovered, but full-length sequences could not be determined. Superfamily phylogenetic trees indicate that vertical inheritance and, in the case of one family, horizontal transfer have been involved in the evolution of the choanoflagellate TEs. Phylogenetic analyses of individual families highlight recent element activity in the genome; however, six families did not show evidence of current transposition. The majority of families possess young insertions and the expression levels of TE genes vary by four orders of magnitude across families. In contrast to previous studies on TEs, the families present in S. rosetta show the signature of selection on codon usage, with families favouring codons that are adapted to the host translational machinery. Selection is stronger in LTR retrotransposons than DNA transposons, with highly expressed families showing stronger codon usage bias. Mutation pressure towards guanosine and cytosine also appears to contribute to TE codon usage. Conclusions:S. rosetta increases the known diversity of choanoflagellate TEs and the complement further highlights the role of horizontal gene transfer from prey species in choanoflagellate genome evolution. Unlike previously studied TEs, the S. rosetta families show evidence for selection on their codon usage, which is shown to act via translational efficiency and translational accuracy.
Project description:BACKGROUND:Transposable elements (TEs) have the potential to impact genome structure, function and evolution in profound ways. To understand the contribution of TEs to Heliconius melpomene, we queried the H. melpomene draft sequence to identify repetitive sequences. RESULTS:We determined that TEs comprise ~25% of the genome. The predominant class of TEs (~12% of the genome) was the non-long terminal repeat (non-LTR) retrotransposons, including a novel SINE family. However, this was only slightly higher than content derived from DNA transposons, which are diverse, with several families having mobilized in the recent past. Compared to the only other well-studied lepidopteran genome, Bombyx mori, H. melpomene exhibits a higher DNA transposon content and a distinct repertoire of retrotransposons. We also found that H. melpomene exhibits a high rate of TE turnover with few older elements accumulating in the genome. CONCLUSIONS:Our analysis represents the first complete, de novo characterization of TE content in a butterfly genome and suggests that, while TEs are able to invade and multiply, TEs have an overall deleterious effect and/or that maintaining a small genome is advantageous. Our results also hint that analysis of additional lepidopteran genomes will reveal substantial TE diversity within the group.
Project description:Repbase is a comprehensive database of eukaryotic transposable elements (TEs) and repeat sequences, containing over 1300 human repeat sequences. Recent analyses of these repeat sequences have accumulated evidence for their contribution to human evolution through becoming functional elements, such as protein-coding regions or binding sites of transcriptional regulators. However, resolving the origins of repeat sequences is a challenge, due to their age, divergence, and degradation. Ancient repeats have been continuously classified as TEs by finding similar TEs from other organisms. Here, the most comprehensive picture of human repeat sequences is presented. The human genome contains traces of 10 clades (L1, CR1, L2, Crack, RTE, RTEX, R4, Vingi, Tx1 and Penelope) of non-long terminal repeat (non-LTR) retrotransposons (long interspersed elements, LINEs), 3 types (SINE1/7SL, SINE2/tRNA, and SINE3/5S) of short interspersed elements (SINEs), 1 composite retrotransposon (SVA) family, 5 classes (ERV1, ERV2, ERV3, Gypsy and DIRS) of LTR retrotransposons, and 12 superfamilies (Crypton, Ginger1, Harbinger, hAT, Helitron, Kolobok, Mariner, Merlin, MuDR, P, piggyBac and Transib) of DNA transposons. These TE footprints demonstrate an evolutionary continuum of the human genome.
Project description:BACKGROUND:Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and provide an opportunity for comprehensive annotation of TEs. Numerous methods exist for annotation of each class of TEs, but their relative performances have not been systematically compared. Moreover, a comprehensive pipeline is needed to produce a non-redundant library of TEs for species lacking this resource to generate whole-genome TE annotations. RESULTS:We benchmark existing programs based on a carefully curated library of rice TEs. We evaluate the performance of methods annotating long terminal repeat (LTR) retrotransposons, terminal inverted repeat (TIR) transposons, short TIR transposons known as miniature inverted transposable elements (MITEs), and Helitrons. Performance metrics include sensitivity, specificity, accuracy, precision, FDR, and F1. Using the most robust programs, we create a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a filtered non-redundant TE library for annotation of structurally intact and fragmented elements. EDTA also deconvolutes nested TE insertions frequently found in highly repetitive genomic regions. Using other model species with curated TE libraries (maize and Drosophila), EDTA is shown to be robust across both plant and animal species. CONCLUSIONS:The benchmarking results and pipeline developed here will greatly facilitate TE annotation in eukaryotic genomes. These annotations will promote a much more in-depth understanding of the diversity and evolution of TEs at both intra- and inter-species levels. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.
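The six performance metrics listed in the abstract above have standard definitions in terms of true/false positives and negatives. The sketch below shows how they could be computed from confusion counts. It assumes the counts are tallied per base pair against a curated annotation, which the abstract does not state; the function names and example counts are hypothetical, and this is not EDTA's own code.

```python
def safe_div(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def annotation_metrics(tp, fp, tn, fn):
    """Compute benchmarking metrics from confusion-matrix counts.

    tp: TE bases correctly annotated as TE
    fp: non-TE bases wrongly annotated as TE
    tn: non-TE bases correctly left unannotated
    fn: TE bases missed by the annotation
    """
    return {
        "sensitivity": safe_div(tp, tp + fn),           # true positive rate
        "specificity": safe_div(tn, tn + fp),           # true negative rate
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": safe_div(tp, tp + fp),             # positive predictive value
        "fdr": safe_div(fp, tp + fp),                   # 1 - precision
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),       # harmonic mean of precision and sensitivity
    }


# Hypothetical counts for illustration only.
metrics = annotation_metrics(tp=80, fp=20, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because TEs often occupy a minority of a genome, accuracy and specificity can look high even for a weak annotator; precision, FDR and F1 are less affected by the large number of true negatives.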
Project description:Background:Flax (Linum usitatissimum L.) is an important crop for the production of bioproducts derived from its seed and stem fiber. Transposable elements (TEs) are widespread in plant genomes and are a key component of their evolution. The availability of a genome assembly of flax (Linum usitatissimum) affords new opportunities to explore the diversity of TEs and their relationship to genes and gene expression. Results:Four de novo repeat identification algorithms (PILER, RepeatScout, LTR_finder and LTR_STRUC) were applied to the flax genome assembly. The resulting library of flax repeats was combined with the RepBase Viridiplantae division and used with RepeatMasker to identify TE coverage in the genome. LTR retrotransposons were the most abundant TEs (17.2% genome coverage), followed by Long Interspersed Nuclear Element (LINE) retrotransposons (2.10%) and Mutator DNA transposons (1.99%). Comparison of putative flax TEs to flax transcript databases indicated that TEs are not highly expressed in flax. However, the presence of recent insertions, defined by 100% intra-element LTR similarity, provided evidence for recent TE activity. Spatial analysis showed TE-rich regions, gene-rich regions as well as regions with similar gene and TE density. Monte Carlo simulations for the 71 largest scaffolds (≥ 1 Mb each) did not show any regional differences in the frequency of TE overlap with gene coding sequences. However, differences between TE superfamilies were found in their proximity to genes. Genes within TE-rich regions also appeared to have lower transcript expression, based on EST abundance. When LTR elements were compared, Copia elements showed more diversity, recent insertions and conserved domains than Gypsy elements, demonstrating their importance in genome evolution. Conclusions:The calculated 23.06% TE coverage of the flax WGS assembly is at the low end of the range of TE coverages reported in other eudicots, although this estimate does not include TEs likely found in unassembled repetitive regions of the genome. Since enrichment for TEs in genomic regions was associated with reduced expression of neighbouring genes, and many members of the Copia LTR superfamily are inserted close to coding regions, we suggest Copia elements have a greater influence on recent flax genome evolution while Gypsy elements have become residual and highly mutated.
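The flax abstract above mentions Monte Carlo simulations of TE overlap with gene coding sequences but does not describe the procedure. The following is a minimal sketch of one common approach, assuming TE positions are redrawn uniformly at random along a single scaffold while their lengths are preserved. The function names, coordinates and null model are hypothetical and may differ from what the authors did.

```python
import random


def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def count_overlapping(tes, coding_regions):
    """Count TEs that overlap at least one coding region."""
    return sum(
        1
        for te_start, te_end in tes
        if any(intervals_overlap(te_start, te_end, c_start, c_end)
               for c_start, c_end in coding_regions)
    )


def monte_carlo_overlap(tes, coding_regions, scaffold_len, n_iter=1000, seed=0):
    """Compare observed TE/coding overlap with randomly placed TEs.

    Returns the observed count, the simulated counts, and the fraction of
    simulations (with a +1 correction) having at least as many overlaps.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = count_overlapping(tes, coding_regions)
    null_counts = []
    for _ in range(n_iter):
        shuffled = []
        for te_start, te_end in tes:
            length = te_end - te_start
            # Draw a new start so the TE still fits on the scaffold.
            new_start = rng.randint(0, scaffold_len - length)
            shuffled.append((new_start, new_start + length))
        null_counts.append(count_overlapping(shuffled, coding_regions))
    p_upper = (1 + sum(c >= observed for c in null_counts)) / (n_iter + 1)
    return observed, null_counts, p_upper


# Hypothetical scaffold with three coding regions and four TEs.
coding = [(1_000, 2_000), (50_000, 52_000), (90_000, 91_500)]
te_positions = [(1_500, 1_800), (10_000, 14_000), (51_000, 56_000), (70_000, 70_300)]
obs, null, p = monte_carlo_overlap(te_positions, coding, scaffold_len=100_000)
print(f"observed overlaps = {obs}, upper-tail fraction = {p:.3f}")
```

A large upper-tail fraction means the observed overlap is no greater than expected under random placement. Comparing scaffolds or regions, as the abstract describes, would repeat this procedure per region.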
Project description:Avian genomes have perplexed researchers by being conservative in both size and rearrangements, while simultaneously holding the blueprints for a massive species radiation during the last 65 million years (My). Transposable elements (TEs) in bird genomes are relatively scarce but have been implicated as important hotspots for chromosomal inversions. In zebra finch (Taeniopygia guttata), long terminal repeat (LTR) retrotransposons have proliferated and are positively associated with chromosomal breakpoint regions. Here, we present the genome, karyotype and transposons of blue-capped cordon-bleu (Uraeginthus cyanocephalus), an African songbird that diverged from zebra finch at the root of estrildid finches 10 million years ago (Mya). This constitutes the third linked-read sequenced genome assembly and fourth in-depth curated TE library of any bird. Exploration of TE diversity on this brief evolutionary timescale constitutes a considerable increase in resolution for avian TE biology and allowed us to uncover 4.5 Mb more LTR retrotransposons in the zebra finch genome. In blue-capped cordon-bleu, we likewise observed a recent LTR accumulation indicating that this is a shared feature of Estrildidae. Curiously, we discovered 25 new endogenous retrovirus-like LTR retrotransposon families of which at least 21 are present in zebra finch but were previously undiscovered. This highlights the importance of studying close relatives of model organisms.
Project description:Asparagus officinalis is an economically and nutritionally important vegetable crop that is widely cultivated and is used as a model dioecious species to study plant sex determination and sex chromosome evolution. To improve our understanding of its genome composition, especially with respect to transposable elements (TEs), which make up the majority of the genome, we performed Illumina HiSeq2000 sequencing of both male and female asparagus genomes followed by bioinformatics analysis. We generated 17 Gb of sequence (12× coverage) and assembled it into 163,406 scaffolds with a total cumulative length of 400 Mbp, which represents about 30% of the asparagus genome. Overall, TEs masked about 53% of the A. officinalis assembly. The majority of the identified TEs belonged to LTR retrotransposons, which constitute about 28% of genomic DNA, with Ty1/copia elements being more diverse and accumulating to higher copy numbers than Ty3/gypsy elements. Compared with LTR retrotransposons, non-LTR retrotransposons and DNA transposons were relatively rare. In addition, comparison of the abundance of the TE groups between male and female genomes showed that the overall TE composition was highly similar, with only slight differences in the abundance of several TE groups, which is consistent with the relatively recent origin of asparagus sex chromosomes. This study greatly improves our knowledge of the repetitive sequence composition of asparagus, which facilitates the identification of TEs responsible for the early evolution of plant sex chromosomes and is helpful for further studies on this dioecious plant.