Transcriptomics

Dataset Information

Primary transcriptome map of the hyperthermophilic archaeon Thermococcus kodakarensis


ABSTRACT: Background: Prokaryotes have relatively small genomes that are densely packed and apparently dominated by protein-encoding sequences. However, data now generated by high-throughput RNA sequencing (RNA-seq) reveal surprisingly complex transcriptomes, with many previously unrecognized and unanticipated non-coding small and antisense transcripts. To date, such studies have primarily investigated Bacteria. Here, we report the transcripts synthesized under different growth and metabolic conditions in Thermococcus kodakarensis, a model hyperthermophilic archaeon.
Results: cDNA libraries were generated and deep-sequenced from RNA preparations isolated from cells growing in media with sulfur or pyruvate, from cells grown with sulfur to stationary phase, and from cells growing with pyruvate but with sulfur added 20 min before RNA isolation. The results identify >2,700 sites of transcription initiation, establish a genome-wide map of transcripts, and define consensus sequences for transcription initiation and post-transcription regulatory elements in T. kodakarensis. Primary transcription start sites (TSS) are identified upstream of 1,254 annotated genes, including ~78% of those predicted by promoter locations, and an additional 644 primary TSS and their promoters have been identified within genes. Most of the mRNAs have a 5'-untranslated region (5'-UTR) between 10 and 50 nt long (median length = 16 nt), ~20% have 5'-UTRs from 50 to 300 nt long, ~14% are leaderless with 5'-UTRs ≤8 nt, and ~50% contain a consensus ribosome binding sequence. The results also identify TSS for 1,018 antisense transcripts, most with sequences complementary to either the 5'- or 3'-region of a sense mRNA. The data confirm the presence of transcripts from all three CRISPR loci, the RNase P and 7S RNAs, all tRNAs and rRNAs, and 69 snoRNAs predicted to be encoded in the T. kodakarensis genome. Two transcripts, putatively identified as riboswitches, were present in RNA preparations isolated from growing but not from stationary phase cells. The procedure used is designed to identify TSS but, assuming that the number of cDNA reads correlates with transcript abundance, the data obtained also provide a semi-quantitative overview of global operon expression. They document substantial differences in gene expression under different physiological conditions and are consistent with previous observations of substrate-dependent specific gene expression. Many previously unrecognized and unanticipated small RNAs have been identified, some with relatively low GC content (≤50%) and sequences that do not fold readily into base-paired secondary structures, contrary to the classical expectations for non-coding RNAs in a hyperthermophile.
Conclusion: We have identified >2,700 TSS that include almost all of the primary sites of transcription initiation upstream of annotated genes, as well as many secondary sites, sites within genes, and sites resulting in antisense transcripts. The T. kodakarensis genome is small (~2.1 Mbp) and tightly packed with protein-encoding genes, but the results reveal the presence of many non-coding RNAs and predict extensive RNA-based regulation in T. kodakarensis.
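The 5'-UTR length classes quoted in the abstract (leaderless at ≤8 nt, 10 to 50 nt, and 50 to 300 nt) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the 1-based coordinate convention, the class labels, and the handling of boundary values are all assumptions made for illustration: a length of exactly 50 nt is placed in the 10-50 class, and lengths the abstract does not assign to any class (9 nt, or more than 300 nt) are reported as "unassigned".

```python
from collections import Counter


def utr5_length(tss, start_codon, strand):
    """Return the 5'-UTR length in nt.

    Assumes 1-based genomic coordinates, with `start_codon` being the
    position of the first base of the start codon on the given strand.
    """
    if strand == "+":
        length = start_codon - tss
    elif strand == "-":
        length = tss - start_codon
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    if length < 0:
        raise ValueError("TSS lies downstream of the start codon")
    return length


def classify_utr(length):
    """Assign a 5'-UTR length to one of the classes named in the abstract.

    Boundary handling is an assumption: 50 nt goes to the 10-50 class,
    and 9 nt or >300 nt are left unassigned.
    """
    if length <= 8:
        return "leaderless"
    if 10 <= length <= 50:
        return "10-50 nt"
    if 50 < length <= 300:
        return "50-300 nt"
    return "unassigned"


# Hypothetical (TSS, start codon, strand) records, for illustration only.
records = [(100, 116, "+"), (500, 495, "-"), (2000, 2120, "+")]
counts = Counter(classify_utr(utr5_length(*rec)) for rec in records)
print(dict(counts))
```

On real data, the records would come from a TSS table joined to the genome annotation; the classification step itself would stay the same.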

ORGANISM(S): Thermococcus kodakarensis KOD1

PROVIDER: GSE56262 | GEO | 2014/08/18

SECONDARY ACCESSION(S): PRJNA242777

REPOSITORIES: GEO

Similar Datasets

2014-08-18 | E-GEOD-56262 | biostudies-arrayexpress
2016-08-10 | GSE82206 | GEO
2016-08-10 | E-GEOD-82206 | biostudies-arrayexpress
2013-12-07 | GSE53065 | GEO
2013-12-07 | E-GEOD-53065 | biostudies-arrayexpress
2016-01-01 | GSE71984 | GEO
2016-07-03 | E-GEOD-71984 | biostudies-arrayexpress
2014-10-16 | E-GEOD-57365 | biostudies-arrayexpress
2018-02-26 | GSE110413 | GEO
2018-03-23 | GSE107532 | GEO