Identification of novel subgenomic RNAs and noncanonical transcription initiation signals of severe acute respiratory syndrome coronavirus.
ABSTRACT: The expression of the genomic information of severe acute respiratory syndrome coronavirus (SARS CoV) involves synthesis of a nested set of subgenomic RNAs (sgRNAs) by discontinuous transcription. In SARS CoV-infected cells, 10 sgRNAs, including 2 novel ones, were identified, which were predicted to be functional in the expression of 12 open reading frames located in the 3' one-third of the genome. Surprisingly, one new sgRNA could lead to production of a truncated spike protein. Sequence analysis of the leader-body fusion sites of each sgRNA showed that the junction sequences and the corresponding transcription-regulatory sequence (TRS) are unique for each species of sgRNA and are consistent after virus passages. For the two novel sgRNAs, each used a variant of the TRS that has one nucleotide mismatch in the conserved hexanucleotide core (ACGAAC) in the TRS. Coexistence of both plus and minus strands of SARS CoV sgRNAs and evidence for derivation of the sgRNA core sequence from the body core sequence favor the model of discontinuous transcription during minus-strand synthesis. Moreover, one rare species of sgRNA has the junction sequence AAA, indicating that its transcription could result from a noncanonical transcription signal. Taken together, these results provide more insight into the molecular mechanisms of genome expression and subgenomic transcription of SARS CoV.
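The search for TRS variants with one mismatch in the conserved core can be illustrated with a short scan. This is a minimal sketch, not the authors' method: it only assumes the hexanucleotide core ACGAAC stated in the abstract. The function names, the toy sequence, and the variant ACGAGC in it are hypothetical and are not taken from the SARS CoV genome.

```python
CORE = "ACGAAC"  # conserved hexanucleotide core named in the abstract


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_core_sites(seq, core=CORE, max_mismatch=1):
    """Return (0-based index, window, mismatches) for every window of the
    sequence that matches the core with at most `max_mismatch` differences.
    RNA input is accepted; U is converted to T before comparison."""
    seq = seq.upper().replace("U", "T")
    hits = []
    for i in range(len(seq) - len(core) + 1):
        window = seq[i:i + len(core)]
        d = hamming(window, core)
        if d <= max_mismatch:
            hits.append((i, window, d))
    return hits


# Hypothetical toy sequence: one exact core and one single-mismatch variant.
toy = "GGACGAACTTTTACGAGCGG"
for index, window, mismatches in find_core_sites(toy):
    print(index, window, mismatches)
```

On a real genome such a scan would return many candidate sites; in the abstract, sgRNA junctions were established by sequencing leader-body fusion sites, so a scan of this kind can only suggest candidates.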
Project description: The 3'-one-third of the severe acute respiratory syndrome coronavirus (SARS-CoV) genome contains genes for four essential structural proteins and eight virus-specific genes. The expression of this genomic information of SARS-CoV involves synthesis of a nested set of subgenomic RNAs (sgRNAs). In this study, we showed that the translational levels of 10 SARS-CoV sgRNAs, including the two low-abundance sgRNAs 2-1 and 3-1, varied considerably in translation reporter assays. We also demonstrated that the initiator AUG codon of sgRNA-8 was silent and that the repressive control was most likely positioned in the upstream untranslated region (UTR) of sgRNA-8. The initiator AUG codons of most sgRNAs are in poor Kozak contexts, and the translation of truncated proteins from downstream AUG codons by leaky scanning was common in our experimental settings. No significant correlation was found between the level of translation of SARS-CoV sgRNAs and either the complexity of the 5'-UTR or the sequence context of the AUG codon. These results will be helpful for further studies to reveal the biological functions and translation regulatory mechanisms of sgRNAs in the coronavirus life cycle and pathogenesis.
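The notion of a poor Kozak context, and why it makes leaky scanning plausible, can be sketched in a few lines. The abstract does not state which criteria were applied, so the following assumes the commonly used rule of thumb: with the A of AUG numbered +1, a purine at -3 and a G at +4 give a strong context, one of the two gives an adequate context, and neither gives a weak one. The function names and the toy sequence are hypothetical.

```python
def kozak_strength(seq, aug_index):
    """Classify the context of the AUG starting at 0-based `aug_index`.
    Assumed rule of thumb: purine at -3 and G at +4 -> 'strong',
    exactly one of them -> 'adequate', neither -> 'weak'."""
    seq = seq.upper().replace("U", "T")
    if seq[aug_index:aug_index + 3] != "ATG":
        raise ValueError("no AUG at the given index")
    # Position -3 is three bases upstream of the A; +4 follows the G of AUG.
    minus3 = seq[aug_index - 3] if aug_index >= 3 else None
    plus4 = seq[aug_index + 3] if aug_index + 3 < len(seq) else None
    score = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return ("weak", "adequate", "strong")[score]


def scan_augs(seq):
    """List every AUG in 5'-to-3' order with its context class."""
    seq = seq.upper().replace("U", "T")
    return [(i, kozak_strength(seq, i))
            for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]


# Hypothetical toy 5' region: a weak first AUG followed by a strong one,
# the arrangement in which leaky scanning to the downstream AUG is expected.
print(scan_augs("CCTCCTATGCAAGCCACCATGGTT"))
```

The classification looks only at positions -3 and +4; it ignores upstream open reading frames and RNA structure, both of which can also affect initiation.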
Project description:Hibiscus chlorotic ringspot virus (HCRSV), which belongs to the genus Carmovirus, generates two 3'-coterminal subgenomic RNAs (sgRNAs) of 1.4 kb and 1.7 kb. Transcription start sites of the two sgRNAs were identified at nucleotides (nt) 2178 and 2438, respectively. The full promoter of sgRNA1, a 118-base sequence, is localized between positions +6 and -112 relative to its transcription start site (+1). Similarly, a 132-base sequence, from +6 to -126, defines the sgRNA2 promoter. Computer analysis revealed that both sgRNA promoters share a similar two-stem-loop (SL1 + SL2) structure, immediately upstream of the transcription start site. Mutational analysis of the primary sequence and secondary structures showed further similarities between the two subgenomic promoters. The basal portion of SL2, encompassing the transcription start site, was essential for transcription activity in each promoter, while SL1 and the upper portion of SL2 played a role in transcription enhancement. Both the 5' untranslated region (UTR) and the last 87 nt at the 3' UTR of HCRSV genomic RNA are likely to be the putative genomic plus-strand and minus-strand promoters, respectively. They function well as individual sgRNA promoters to produce ectopic subgenomic RNAs in vivo but not to the same levels of the actual sgRNA promoters. This suggests that HCRSV sgRNA promoters share common features with the promoters for genomic plus-strand and minus-strand RNA synthesis. To our knowledge, this is the first demonstration that both the 5' UTR and part of the 3' UTR can be duplicated and function as sgRNA promoters within a single viral genome.
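The promoter lengths quoted above follow from the usual numbering convention in which the transcription start site is +1 and there is no position 0. The sketch below assumes that convention and checks the arithmetic; the function names are hypothetical, and pairing the start site at nt 2178 with the 118-base promoter (and nt 2438 with the 132-base promoter) is an assumption about the order implied by "respectively". The computed lengths do not depend on that pairing.

```python
def relative_to_genome(tss, rel):
    """Convert a position relative to the transcription start site
    (+1 = start site, no position 0) into a genome coordinate."""
    if rel == 0:
        raise ValueError("position 0 does not exist in this convention")
    return tss + rel - 1 if rel > 0 else tss + rel


def promoter_span(tss, upstream, downstream):
    """Return (genome start, genome end, length in bases) of a promoter
    defined by relative positions such as -112 and +6."""
    start = relative_to_genome(tss, upstream)
    end = relative_to_genome(tss, downstream)
    return start, end, end - start + 1


# Assumed pairing of start sites and promoters (see the note above).
print(promoter_span(2178, -112, 6))  # 118-base promoter
print(promoter_span(2438, -126, 6))  # 132-base promoter
```

With these inputs the spans are nt 2066 to 2183 (118 bases) and nt 2312 to 2443 (132 bases), which agrees with the lengths given in the abstract.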
Project description:Coronavirus (CoV) transcription includes a discontinuous mechanism during the synthesis of sub-genome-length minus-strand RNAs leading to a collection of mRNAs in which the 5' terminal leader sequence is fused to contiguous genome sequences. It has been previously shown that transcription-regulating sequences (TRSs) preceding each gene regulate transcription. Base pairing between the leader TRS (TRS-L) and the complement of the body TRS (cTRS-B) in the nascent RNA is a determinant factor during CoV transcription. In fact, in transmissible gastroenteritis CoV, a good correlation has been observed between subgenomic mRNA (sg mRNA) levels and the free energy (DeltaG) of TRS-L and cTRS-B duplex formation. The only exception was sg mRNA N, the most abundant sg mRNA during viral infection in spite of its minimum DeltaG associated with duplex formation. We postulated that additional factors should regulate transcription of sg mRNA N. In this report, we have described a novel transcription regulation mechanism operating in CoV by which a 9-nucleotide (nt) sequence located 449 nt upstream of the N gene TRS core sequence (CS-N) interacts with a complementary sequence just upstream of CS-N, specifically increasing the accumulation of sg mRNA N. Alteration of this complementarity in mutant replicon genomes showed a correlation between the predicted stability of the base pairing between 9-nt sequences and the accumulation of sg mRNA N. This interaction is exclusively conserved in group 1a CoVs, the only CoV subgroup in which the N gene is not the most 3' gene in the viral genome. This is the first time that a long-distance RNA-RNA interaction regulating transcriptional activity specifically enhancing the transcription of one gene has been described to occur in CoVs.
Project description:Transmissible gastroenteritis coronavirus (TGEV) genomic RNA transcription generates 5'- and 3'-coterminal subgenomic mRNAs. This process involves a discontinuous step during the synthesis of minus-sense RNA that is modulated by transcription-regulating sequences located at the 3' end of the leader (TRS-L) and also preceding each viral gene (TRS-Bs). TRSs include a highly conserved core sequence (CS) (5'-CUAAAC-3') and variable flanking sequences. It has been previously proposed that TRS-Bs act as attenuation or stop signals during the synthesis of minus-sense RNAs. The nascent minus-stranded RNA would then be transferred by a template switch process to the TRS-L, which acts as the acceptor RNA. To study whether the TRS-L is structured and to determine whether this structure has a functional impact on genomic and subgenomic viral RNA synthesis, we have used a combination of nuclear magnetic resonance (NMR) spectroscopy and UV thermal denaturation approaches together with site-directed mutagenesis and in vivo transcriptional analyses. The results indicated that a 36-nucleotide oligomer encompassing the wild-type TRS-L forms a structured hairpin closed by an apical AACUAAA heptaloop. This loop contains most of the CS and is isolated from a nearby internal loop by a short Watson-Crick base-paired stem. TRS-L mutations altering the structure and the stability of the TRS-L hairpin affected replication and transcription, indicating the requirement of a functional RNA hairpin structure in these processes.
Project description:To generate an extensive set of subgenomic (sg) mRNAs, nidoviruses (arteriviruses and coronaviruses) use a mechanism of discontinuous transcription. During this process, mRNAs are generated that represent the genomic 5' sequence, the so-called leader RNA, fused at specific positions to different 3' regions of the genome. The fusion of the leader to the mRNA bodies occurs at a short, conserved sequence element, the transcription-regulating sequence (TRS), which precedes every transcription unit in the genome and is also present at the 3' end of the leader sequence. Here, we have used site-directed mutagenesis of the infectious cDNA clone of the arterivirus equine arteritis virus to show that sg mRNA synthesis requires a base-pairing interaction between the leader TRS and the complement of a body TRS in the viral negative strand. Mutagenesis of the body TRS of equine arteritis virus RNA7 reduced sg RNA7 transcription severely or abolished it completely. Mutations in the leader TRS dramatically influenced the synthesis of all sg mRNAs. The construction of double mutants in which a mutant leader TRS was combined with the corresponding mutant RNA7 body TRS resulted in the specific restoration of mRNA7 synthesis. The analysis of the mRNA leader-body junctions of a number of mutants with partial transcriptional activity provided support for a mechanism of discontinuous minus-strand transcription that resembles similarity-assisted, copy-choice RNA recombination.
Project description:Citrus tristeza virus (CTV), a member of the Closteroviridae, has a positive-sense RNA genome of about 20 kb organized into 12 open reading frames (ORFs). The last 10 ORFs are expressed through 3'-coterminal subgenomic RNAs (sgRNAs) regulated in both amounts and timing. Additionally, relatively large amounts of complementary sgRNAs are produced. We have been unable to determine whether these sgRNAs are produced by internal promotion from the full-length template minus strand or by transcription from the minus-stranded sgRNAs. Understanding the regulation of 10 sgRNAs is a conceptual challenge. In analyzing commonalities of a replicase complex in producing so many sgRNAs, we examined initiating nucleotides of the sgRNAs. We mapped the 5' termini of intermediate- (CP and p13) and low- (p18) produced sgRNAs that, like the two highly abundant sgRNAs (p20 and p23) previously mapped, all initiate with an adenylate. We then examined modifications of the initiation site, which has been shown to be useful in defining mechanisms of sgRNA synthesis. Surprisingly, mutation of the initiating nucleotide of the CTV sgRNAs did not prevent sgRNA accumulation. Based on our results, the CTV replication complex appears to initiate sgRNA synthesis with purines, preferably with adenylates, and is able to initiate synthesis using a nucleotide a few positions 5' or 3' of the native initiation nucleotide. Furthermore, the context of the initiation site appears to be a regulatory mechanism for levels of sgRNA production. These data do not support either of the established mechanisms for synthesis of sgRNAs, suggesting that CTV sgRNA production utilizes a different mechanism.
Project description:The generation of subgenomic mRNAs in coronavirus involves a discontinuous mechanism of transcription by which the common leader sequence, derived from the genome 5' terminus, is fused to the 5' end of the mRNA coding sequence (body). Transcription-regulating sequences (TRSs) precede each gene and include a conserved core sequence (CS) surrounded by relatively variable sequences (5' TRS and 3' TRS). Regulation of transcription in coronaviruses has been studied by reverse-genetics analysis of the sequences immediately flanking a unique CS in the Transmissible gastroenteritis virus genome (CS-S2), located inside the S gene, that does not lead to detectable amounts of the corresponding mRNA, in spite of its canonical sequence. The transcriptional inactivity of CS-S2 was genome position independent. The presence of a canonical CS was not sufficient to drive transcription, but subgenomic synthesis requires a minimum base pairing between the leader TRS (TRS-L) and the complement of the body TRS (cTRS-B) provided by the CS and its adjacent nucleotides. A good correlation was observed between the free energy of TRS-L and cTRS-B duplex formation and the levels of subgenomic mRNA S2, demonstrating that base pairing between the leader and body beyond the CS is a determinant regulation factor in coronavirus transcription. In TRS mutants with increasing complementarity between TRS-L and cTRS-B, a tendency to reach a plateau in DeltaG values was observed, suggesting that a more precise definition of the TRS limits might be proposed, specifically that it consists of the central CS and around 4 nucleotides flanking 5' and 3' the CS. Sequences downstream of the CS exert a stronger influence on the template-switching decision according to a model of polymerase strand transfer and template switching during minus-strand synthesis.
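The idea that base pairing beyond the core sequence matters can be made concrete with a crude proxy. The study correlates mRNA levels with the free energy of the TRS-L/cTRS-B duplex; the sketch below does not compute a free energy. It only counts Watson-Crick pairs in an antiparallel alignment and ignores G-U wobble pairs and stacking energies. The core 5'-CUAAAC-3' is the TGEV core sequence; the flanking nucleotides, the sequences as a whole, and the function names are hypothetical.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}  # Watson-Crick pairs only


def reverse_complement(rna):
    """Reverse complement of an RNA string written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def paired_positions(trs_l, ctrs_b):
    """Count Watson-Crick pairs between TRS-L and cTRS-B (both written
    5'->3') in an ungapped antiparallel alignment: trs_l[i] faces
    ctrs_b[-1 - i]."""
    return sum(PAIR[a] == b for a, b in zip(trs_l, reversed(ctrs_b)))


# Hypothetical 12-nt TRSs: the CUAAAC core plus three invented nucleotides
# on each side.
trs_l = "GAACUAAACGAA"
trs_b = "UUACUAAACGCA"
ctrs_b = reverse_complement(trs_b)  # complement of the body TRS in the nascent minus strand

print(paired_positions(trs_l, ctrs_b))                     # pairs with this body TRS
print(paired_positions(trs_l, reverse_complement(trs_l)))  # fully complementary case
```

A pair count treats every position equally, whereas a nearest-neighbor free-energy calculation weights G-C pairs and stacking context differently, so the two measures can rank the same set of TRS mutants differently.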
Project description: The SARS-CoV-2 coronavirus is driving a global pandemic, but its biological mechanisms are not well understood. SARS-CoV-2 is an RNA virus whose multiple genomic and subgenomic RNA (sgRNA) transcripts hijack the host cell's machinery, which is distributed across distinct cytotopic locations. Subcellular localization of its viral RNA could play important roles in viral replication and host antiviral immune response. Here we perform computational modeling of SARS-CoV-2 viral RNA localization across eight subcellular neighborhoods. We compare hundreds of SARS-CoV-2 genomes to the human transcriptome and other coronaviruses and perform systematic sub-sequence analyses to identify the responsible signals. Using state-of-the-art machine learning models, we predict that the SARS-CoV-2 RNA genome and all sgRNAs are enriched in the host mitochondrial matrix and nucleolus. The 5' and 3' viral untranslated regions possess the strongest and most distinct localization signals. We discuss the mitochondrial localization signal in relation to the formation of double-membrane vesicles, a critical stage in the coronavirus life cycle. Our computational analysis serves as a hypothesis generation tool to suggest models for SARS-CoV-2 biology and inform experimental efforts to combat the virus.
Project description:Nidoviruses produce an extensive 3'-coterminal nested set of subgenomic mRNAs, which are used to express their structural proteins. In addition, arterivirus and coronavirus mRNAs contain a common 5' leader sequence, derived from the genomic 5' end. The joining of this leader sequence to different segments (mRNA bodies) from the genomic 3'-proximal region presumably involves a unique mechanism of discontinuous minus-strand RNA synthesis. Key elements in this process are the so-called transcription-regulating sequences (TRSs), which determine a base-pairing interaction between sense and antisense viral RNA that is essential for leader-to-body joining. To identify RNA structures in the 5'-proximal region of the equine arteritis virus genome that may be involved in subgenomic mRNA synthesis, a detailed secondary RNA structure model was established using bioinformatics, phylogenetic analysis, and RNA structure probing. According to this structure model, the leader TRS is located in the loop of a prominent hairpin (leader TRS hairpin; LTH). The importance of the LTH was supported by the results of a mutagenesis study using an EAV molecular clone. Besides evidence for a direct role of the LTH in subgenomic RNA synthesis, indications for a role of the LTH region in genome replication and/or translation were obtained. Similar LTH structures could be predicted for the 5'-proximal region of all arterivirus genomes and, interestingly, also for most coronaviruses. Thus, we postulate that the LTH is a key structural element in the discontinuous subgenomic RNA synthesis and is likely critical for leader TRS function.
Project description:SARS-CoV-2 genomic and subgenomic RNA (sgRNA) transcripts hijack the host cell's machinery. Subcellular localization of its viral RNA could, thus, play important roles in viral replication and host antiviral immune response. We perform computational modeling of SARS-CoV-2 viral RNA subcellular residency across eight subcellular neighborhoods. We compare hundreds of SARS-CoV-2 genomes with the human transcriptome and other coronaviruses. We predict the SARS-CoV-2 RNA genome and sgRNAs to be enriched toward the host mitochondrial matrix and nucleolus, and that the 5' and 3' viral untranslated regions contain the strongest, most distinct localization signals. We interpret the mitochondrial residency signal as an indicator of intracellular RNA trafficking with respect to double-membrane vesicles, a critical stage in the coronavirus life cycle. Our computational analysis serves as a hypothesis generation tool to suggest models for SARS-CoV-2 biology and inform experimental efforts to combat the virus. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.