Dataset Information


Retrotransposons and their recognition of pol II promoters: a comprehensive survey of the transposable elements from the complete genome sequence of Schizosaccharomyces pombe.

ABSTRACT: The complete DNA sequence of the genome of Schizosaccharomyces pombe provides the opportunity to investigate the entire complement of transposable elements (TEs), their association with specific sequences, their chromosomal distribution, and their evolution. Using homology-based sequence identification, we found that the sequenced strain of S. pombe contained only one family of full-length transposons. This family, Tf2, consisted of 13 full-length copies of a long terminal repeat (LTR) retrotransposon. We found that LTR-LTR recombination of previously existing transposons had resulted in extensive populations of solo LTRs. These included 35 solo LTRs of Tf2, as well as 139 solo LTRs from other Tf families. Phylogenetic analysis of solo Tf LTRs revealed that Tf1 and Tf2 were the most recently active elements within the genome. The solo LTRs also served as footprints for previous insertion events by the Tf retrotransposons. Analysis of 186 genomic insertion events revealed a close association with RNA polymerase II promoters. These insertions clustered in the promoter-proximal regions of genes, 100 to 400 nucleotides upstream of protein-coding regions. The association of Tf insertions with pol II promoters was very similar to the preference previously observed for Tf1 integration. We found that the recently active Tf elements were absent from centromeres and from pericentromeric regions of the genome containing tandem tRNA gene clusters. In addition, our analysis revealed that chromosome III has twice the density of insertion events compared to the other two chromosomes. Finally, we describe a novel repetitive sequence, wtf, which was also preferentially located on chromosome III and was often located near solo LTRs of Tf elements.


PROVIDER: S-EPMC403668 | BioStudies | 2003-01-01

REPOSITORIES: biostudies

Similar Datasets

1000-01-01 | S-EPMC362960 | BioStudies
2015-01-01 | S-EPMC4600332 | BioStudies
2012-01-01 | S-EPMC3457346 | BioStudies
2008-01-01 | S-EPMC2423209 | BioStudies
2016-01-01 | S-EPMC4905674 | BioStudies
2013-01-01 | S-EPMC3701649 | BioStudies
1000-01-01 | S-EPMC1470764 | BioStudies
2017-01-01 | S-EPMC5529029 | BioStudies
2020-01-01 | S-EPMC7251063 | BioStudies
2007-01-01 | S-EPMC1941750 | BioStudies