Transcriptomics

Dataset Information

Long first exons and epigenetic marks distinguish conserved pachytene piRNA clusters from other mammalian genes


ABSTRACT: In the male germ cells of eutherian animals, 26–30-nt-long PIWI-interacting RNAs (piRNAs) emerge when spermatocytes enter the pachytene phase of meiosis. These pachytene piRNAs derive from ~100 discrete autosomal loci that resemble canonical protein-coding genes and long non-coding RNA-producing genes: they are transcribed by RNA polymerase II, their transcripts bear 5´ caps and 3´ poly(A) tails, and those transcripts often contain introns that are removed before nuclear export and processing into piRNAs. However, it is unclear which genic and epigenetic features distinguish pachytene piRNA genes from other types of genes and dictate their germline-specific expression. We report that an unusually long first exon (≥ 10 kb), or a long gene lacking introns altogether, is highly correlated with the germline-specific production of piRNA precursor transcripts from mouse pachytene piRNA loci. We also found that these precursor transcripts are enriched for binding by THOC1 (also known as HPR1) and THOC2, subunits of the THO complex, which is critical for transcription elongation and nuclear export of mRNAs. Our integrative analysis of transcriptome, piRNA, and epigenome datasets across multiple species reveals that a long first exon is an evolutionarily conserved feature of pachytene piRNA loci. We further found that a highly methylated promoter, often containing a low or intermediate level of CG dinucleotides, correlates with germline expression and somatic silencing of pachytene piRNA loci.

ORGANISM(S): Mus musculus, Macaca mulatta

PROVIDER: GSE147724 | GEO | 2020/10/24

REPOSITORIES: GEO

Similar Datasets

2019-03-28 | GSE126578 | GEO
2019-08-30 | GSE136603 | GEO
2020-12-07 | PXD018657 | Pride
2019-12-23 | GSE135791 | GEO
2019-07-29 | PXD014472 | Pride
2021-07-02 | PXD019674 | Pride
2021-04-15 | PXD019670 | Pride
2021-04-16 | PXD019671 | Pride
2015-09-01 | E-MTAB-1265 | biostudies-arrayexpress
2014-11-20 | E-GEOD-41976 | biostudies-arrayexpress