Prediction and identification of natural antisense transcripts and their small RNAs in soybean (Glycine max).
ABSTRACT: BACKGROUND: Natural antisense transcripts (NATs) are a class of RNAs that contain a sequence complementary to other transcripts. NATs occur widely in eukaryotes and play critical roles in post-transcriptional regulation. Soybean NAT sequences are predicted in the PlantNATsDB, but detailed analyses of these NATs remain to be performed. RESULTS: A total of 26,216 NATs, including 994 cis-NATs and 25,222 trans-NATs, were predicted in soybean. Each sense transcript had 1-177 antisense transcripts. We identified 21 trans-NATs using RT-PCR amplification. Additionally, we identified 179 cis-NATs and 6,629 trans-NATs that gave rise to small RNAs; these were enriched in the NAT overlapping region. The most abundant small RNAs were 21, 22, and 24 nt in length. The generation of small RNAs was biased to one strand of the NATs, and the degradation of NATs was biased. High-throughput sequencing of the degradome allowed for the global identification of NAT small interfering RNA (nat-siRNA) targets. We identified 446 target genes for 165 of these nat-siRNAs. The nat-siRNA target could be one transcript of a given NAT, or a transcript from another gene. We identified five NAT transcripts containing a hairpin structure that is characteristic of pre-miRNA. We identified a total of 86 microRNA (miRNA) targets that had antisense transcripts in soybean. CONCLUSIONS: We globally identified nat-siRNAs and the targets of nat-siRNAs in soybean. It is likely that the cis-NATs, trans-NATs, nat-siRNAs, miRNAs, and miRNA targets form complex regulatory networks.
Project description: Natural antisense transcripts (NATs) are endogenous transcripts that can form double-stranded RNA structures. Many protein-coding genes (PCs) and non-protein-coding genes (NPCs) tend to form cis-NATs and trans-NATs, respectively. In this work, we identified 4,080 cis-NATs and 2,491 trans-NATs genome-wide in Arabidopsis. From these, 5,385 NAT-siRNAs were detected in the small RNA sequencing data. NAT-siRNAs are typically either 21 nt, which are processed by Dicer-like 1 (DCL1)/DCL2 and RDR6 and function in epigenetically activated situations, or 24 nt, which suggests processing by DCL3 and RDR2 and a function in environmental stress. NAT-siRNAs are significantly derived from PC/PC pairs of trans-NATs and NPC/NPC pairs of cis-NATs. Furthermore, the genes of a NAT pair typically have a similar pattern of epigenetic status. Cis-NATs tend to be marked by euchromatic modifications, whereas trans-NATs tend to be marked by heterochromatic modifications.
Project description: A large proportion of eukaryotic genomes is transcribed from both positive and negative strands of DNA and thus may generate overlapping sense and antisense transcripts. Some of these so-called natural antisense transcripts (NATs) are possibly co-regulated. When the overlapping sense and antisense transcripts are expressed at the same time in the same cell in response to various developmental and environmental cues, they may form double-stranded RNAs, which could be recognized by the small RNA biogenesis machinery and processed into small interfering RNAs (siRNAs). cis-NAT-derived siRNAs (nat-siRNAs) are present in plants, animals, and fungi. In plants, the presence of nat-siRNAs is supported not only by Northern blot and genetic analyses, but also by the fact that there is an overall sixfold enrichment of siRNAs in the overlapping regions of cis-NATs and that 19%-29% of the siRNA-generating cis-NATs in plants give rise to siRNAs only in their overlapping regions. Silencing mediated by nat-siRNAs is one of the mechanisms for regulating the expression of the cis-NATs. This review focuses on challenging issues related to the biogenesis mechanisms as well as regulation and detection of nat-siRNAs. The advantages and limitations of new technologies for detecting cis-NATs, including direct RNA sequencing and strand-specific RNA sequencing, are also discussed.
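The fold enrichment of siRNAs in overlapping regions mentioned above can be illustrated with a small calculation. The following is a minimal sketch, assuming enrichment is defined as the ratio of siRNA read density (reads per nucleotide) inside the overlap to the density outside it; the function names and the example numbers are hypothetical and are not taken from the review.

```python
def density(reads: int, length_nt: int) -> float:
    """siRNA reads per nucleotide for one region."""
    if length_nt <= 0:
        raise ValueError("region length must be positive")
    return reads / length_nt


def overlap_enrichment(overlap_reads: int, overlap_len: int,
                       flank_reads: int, flank_len: int) -> float:
    """Ratio of read density in the overlap to density in the non-overlapping part.

    Assumption: enrichment is a simple density ratio; the source does not
    state the exact definition used.
    """
    flank_density = density(flank_reads, flank_len)
    if flank_density == 0:
        raise ValueError("no reads outside the overlap; ratio is undefined")
    return density(overlap_reads, overlap_len) / flank_density


# Hypothetical cis-NAT pair: 300 reads in a 200-nt overlap,
# 250 reads in the remaining 1000 nt of the two transcripts.
fold = overlap_enrichment(300, 200, 250, 1000)
print(f"fold enrichment: {fold:.1f}")
```

A density ratio rather than a raw read ratio is used here so that short overlaps are not penalized for their length.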
Project description: In spite of large intergenic spaces in plant and animal genomes, 7% to 30% of genes in the genomes encode overlapping cis-natural antisense transcripts (cis-NATs). The widespread occurrence of cis-NATs suggests an evolutionary advantage for this type of genomic arrangement. Experimental evidence for the regulation of two cis-NAT gene pairs by natural antisense transcripts-generated small interfering RNAs (nat-siRNAs) via the RNA interference (RNAi) pathway has been reported in Arabidopsis. However, the extent of siRNA-mediated regulation of cis-NAT genes is still unclear in any genome. The hallmarks of RNAi regulation of NATs are 1) inverse regulation of two genes in a cis-NAT pair by environmental and developmental cues and 2) generation of siRNAs by cis-NAT genes. We examined Arabidopsis transcript profiling data from public microarray databases to identify cis-NAT pairs whose sense and antisense transcripts show opposite expression changes. A subset of the cis-NAT genes displayed negatively correlated expression profiles as well as inverse differential expression changes under at least one of the examined developmental stages or treatment conditions. By searching the Arabidopsis Small RNA Project (ASRP) and Massively Parallel Signature Sequencing (MPSS) small RNA databases as well as our stress-treated small RNA dataset, we found small RNAs that matched at least one gene in 646 pairs out of 1008 (64%) protein-coding cis-NAT pairs, which suggests that siRNAs may regulate the expression of many cis-NAT genes. In total, 209 putative siRNAs have the potential to target more than one gene, and half of these small RNAs could target multiple members of a gene family. Furthermore, the majority of the putative siRNAs within the overlapping regions tend to target only one transcript of a given NAT pair, which is consistent with our previous finding on salt- and bacteria-induced nat-siRNAs.
In addition, we found that genes encoding plastid- or mitochondrion-targeted proteins are over-represented in the Arabidopsis cis-NATs and that 19% of sense and antisense partner genes of cis-NATs share at least one common Gene Ontology term, which suggests that they encode proteins with a possible functional connection. The negatively correlated expression patterns of sense and antisense genes as well as the presence of siRNAs in many of the cis-NATs suggest that siRNA regulation of cis-NATs via the RNAi pathway is an important gene regulatory mechanism for at least a subgroup of cis-NATs in Arabidopsis.
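The first hallmark above, inverse regulation of the two genes in a cis-NAT pair, amounts to a negative correlation between their expression profiles. The following is a minimal sketch, assuming Pearson correlation across conditions as the measure; the abstract does not name the statistic used, and the expression values and the cutoff of -0.5 are hypothetical.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def is_inversely_regulated(r: float, cutoff: float = -0.5) -> bool:
    """Hypothetical rule: call a pair inversely regulated when r <= cutoff."""
    return r <= cutoff


# Hypothetical expression values for one cis-NAT pair across four conditions.
sense = [1.0, 2.0, 3.0, 4.0]
antisense = [8.0, 6.0, 4.0, 2.0]
r = pearson(sense, antisense)
print(f"r = {r:.2f}, inversely regulated: {is_inversely_regulated(r)}")
```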
Project description: Natural antisense transcripts (NATs) are coding or non-coding RNAs with sequence complementarity to other transcripts (sense transcripts). These RNAs could potentially regulate the expression of their sense partner(s) at either the transcriptional or post-transcriptional level. Experimental and computational methods have demonstrated the widespread occurrence of NATs in eukaryotes. However, most previous studies focused only on cis-NATs, with little attention being paid to NATs that originate in trans. We have performed a genome-wide screen of trans-NATs in Arabidopsis thaliana and identified 1,320 putative trans-NAT pairs. An RNA annealing program predicted that most trans-NATs could form extended double-stranded RNA duplexes with their sense partners. Among trans-NATs with available expression data, more than 85% were found in the same tissue as their sense partners; of these, 67% were found in the same cell as their sense partners at comparable expression levels. For about 60% of Arabidopsis trans-NATs, orthologs of at least one transcript of the pair also had trans-NAT partners in either Populus trichocarpa or Oryza sativa. The observation that 430 transcripts had both putative cis- and trans-NATs suggests multiple modes of regulation by antisense transcripts. The potential roles of trans-NATs in inducing post-transcriptional gene silencing and in regulating alternative splicing were also examined. The Arabidopsis transcriptome contains a fairly large number of trans-NATs, whose possible functions include silencing of the corresponding sense transcripts or altering their splicing patterns. The interlaced relationships observed in some cis- and trans-NAT pairs suggest that antisense transcripts could be involved in complex regulatory networks in eukaryotes.
Project description: Many eukaryotic genomes encode cis-natural antisense transcripts (cis-NATs). Sense and antisense transcripts may form double-stranded RNAs that are processed by the RNA interference machinery into small interfering RNAs (siRNAs). A few so-called nat-siRNAs have been reported in plants, mammals, Drosophila, and yeasts. However, many questions remain regarding the features and biogenesis of nat-siRNAs. Through deep sequencing, we identified more than 17,000 unique siRNAs corresponding to cis-NATs from biotic and abiotic stress-challenged Arabidopsis thaliana and 56,000 from abiotic stress-treated rice. These siRNAs were enriched in the overlapping regions of NATs and exhibited either site-specific or distributed patterns, often with strand bias. Out of 1,439 and 767 cis-NAT pairs identified in Arabidopsis and rice, respectively, 84 and 119 could generate at least 10 siRNAs per million reads from the overlapping regions. Among them, 16 cis-NAT pairs from Arabidopsis and 34 from rice gave rise to nat-siRNAs exclusively in the overlap regions. Genetic analysis showed that the overlapping double-stranded RNAs could be processed by Dicer-like 1 (DCL1) and/or DCL3. The DCL3-dependent nat-siRNAs were also dependent on RNA-dependent RNA polymerase 2 (RDR2) and plant-specific RNA polymerase IV (PolIV), whereas only a fraction of DCL1-dependent nat-siRNAs was RDR- and PolIV-dependent. Furthermore, the levels of some nat-siRNAs were regulated by specific biotic or abiotic stress conditions in Arabidopsis and rice. Our results suggest that nat-siRNAs display distinct distribution patterns and are generated by DCL1 and/or DCL3. Our analysis further supported the existence of nat-siRNAs in plants and advanced our understanding of their characteristics.
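The threshold of at least 10 siRNAs per million reads can be illustrated as a simple normalization step. The following is a minimal sketch, assuming raw counts from the overlapping regions are scaled by the total library size; the pair names, counts, and library size are hypothetical, and the abstract does not describe the authors' actual normalization procedure.

```python
def reads_per_million(count: int, library_size: int) -> float:
    """Scale a raw read count to reads per million total reads."""
    if library_size <= 0:
        raise ValueError("library size must be positive")
    return count * 1_000_000 / library_size


def pairs_above_threshold(overlap_counts: dict[str, int],
                          library_size: int,
                          min_rpm: float = 10.0) -> list[str]:
    """Names of cis-NAT pairs whose overlap-region siRNA abundance is >= min_rpm."""
    return sorted(
        pair for pair, count in overlap_counts.items()
        if reads_per_million(count, library_size) >= min_rpm
    )


# Hypothetical raw siRNA counts in the overlap region of three cis-NAT pairs,
# from a library of 4 million mapped reads.
counts = {"pair1": 250, "pair2": 12, "pair3": 40}
selected = pairs_above_threshold(counts, library_size=4_000_000)
print(selected)
```

Normalizing before thresholding keeps the cutoff comparable between libraries of different sequencing depth, such as the Arabidopsis and rice datasets.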
Project description: BACKGROUND: Cis-natural antisense transcripts (cis-NATs) are RNAs transcribed from the antisense strand of a gene locus, and are complementary to the RNA transcribed from the sense strand. Microarray approaches and analysis of transcriptome databases are the major ways to globally identify cis-NATs in various eukaryotic organisms. Genome-wide in silico analysis has identified a large number of cis-NATs that may generate endogenous short interfering RNAs (nat-siRNAs), which participate in important biogenesis mechanisms for transcriptional and post-transcriptional regulation in rice. However, the transcriptomes are yet to be deeply sequenced to comprehensively investigate cis-NATs. RESULTS: We applied high-throughput strand-specific complementary DNA sequencing technology (ssRNA-seq) to deeply sequence mRNA for assessing sense and antisense transcripts that were derived under salt, drought and cold stresses, and normal conditions, in the model plant rice (Oryza sativa). Combined with RAP-DB genome annotation (the Rice Annotation Project Database build-5 data set), 76,013 transcripts corresponding to 45,844 unique gene loci were assembled, in which 4873 gene loci were newly identified. Of 3819 putative rice cis-NATs, 2292 were detected as expressed and giving rise to small RNAs from their overlapping regions through integrated analysis of ssRNA-seq data and small RNA data. Among them, 503 cis-NATs seemed to be associated with specific conditions. The deep sequence data from isolated epidermal cells of rice seedlings further showed that 54.0% of cis-NATs were expressed simultaneously in a population of homogenous cells. Nearly 9.7% of rice transcripts were involved in one-to-one or many-to-many cis-NAT formation. Furthermore, only 17.4-34.7% of 223 many-to-many cis-NAT groups were all expressed and generated nat-siRNAs, indicating that only some cis-NAT groups may be involved in complex regulatory networks.
CONCLUSIONS: Our study profiles an abundance of cis-NATs and nat-siRNAs in rice. These data are valuable for gaining insight into the complex function of the rice transcriptome.
Project description: BACKGROUND: Naturally occurring antisense transcripts (NATs) are non-coding RNAs that may regulate the activity of sense transcripts to which they bind because of complementarity. NATs that are not located in the gene they regulate (trans-NATs) have better chances to evolve than cis-NATs, which is evident when the sense strand of the cis-NAT is part of a protein-coding gene. However, the generation of a trans-NAT requires the formation of a relatively large region of complementarity to the gene it regulates. RESULTS: Pseudogene formation may be one evolutionary mechanism that generates trans-NATs to the parental gene. For example, this could occur if the parental gene is regulated by a cis-NAT that is copied as a trans-NAT in the pseudogene. To support this, we identified human pseudogenes with a trans-NAT to the parental gene in their antisense strand by analysis of the database of expressed sequence tags (ESTs). We found that the mutations that appeared in these trans-NATs after the pseudogene formation do not show the flat distribution that would be expected in a non-functional transcript. Instead, we found higher similarity to the parental gene in a region near the 3' end of the trans-NATs. CONCLUSIONS: Our results do not imply a functional relation of the trans-NAT arising from pseudogenes over their respective parental genes but add evidence for it and stress the importance of duplication mechanisms of genetic material in the generation of non-coding RNAs. We also provide a plausible explanation for the large transcripts that can be found in the antisense strand of some pseudogenes.
Project description: BACKGROUND: Brassica rapa includes several important leaf vegetable crops whose production is often damaged by high temperature. Cis-natural antisense transcripts (cis-NATs) and cis-NAT-derived small interfering RNAs (nat-siRNAs) play important roles in plant development and stress responses. However, genome-wide cis-NATs in B. rapa are not known, and the NATs and nat-siRNAs that respond to heat stress have never been well studied in B. rapa. Here, we took advantage of RNA-seq and small RNA (sRNA) deep sequencing technology to identify cis-NATs and heat-responsive nat-siRNAs in B. rapa. RESULTS: Analyses of four RNA sequencing datasets revealed 1031 cis-NATs in B. rapa ssp. chinensis cv Wut and B. rapa ssp. pekinensis cv. Bre. Based on sequence homology between Arabidopsis thaliana and B. rapa, 303 conserved cis-NATs in B. rapa were found to correspond to 280 cis-NATs in Arabidopsis; the remaining 728 novel cis-NATs were identified as Brassica-specific ones. Using six sRNA libraries, 4846 nat-siRNAs derived from 150 cis-NATs were detected. Differential expression analysis revealed that nat-siRNAs derived from 12 cis-NATs were responsive to heat stress, and most of them showed strand bias. Real-time PCR indicated that most of the transcripts generating heat-responsive nat-siRNAs were upregulated under heat stress, while the transcripts from the opposite strands of the same loci were downregulated. CONCLUSIONS: Our results provide the first subsets of genome-wide cis-NATs and heat-responsive nat-siRNAs in B. rapa; these sRNAs are potentially useful for the genetic improvement of heat tolerance in B. rapa and other crops.
Project description: Nat-siRNAs (small interfering RNAs originating from natural antisense transcripts) are a class of functional small RNA (sRNA) species discovered in both plants and animals. These siRNAs are highly enriched within the annealed regions of the NAT (natural antisense transcript) pairs. To date, considerable research effort has been devoted to the systematic identification of NATs in various organisms. However, researchers still need a freely available and easy-to-use program for NAT prediction. Here, we propose an integrative pipeline named NATpipe for the systematic discovery of NATs from de novo assembled transcriptomes. By utilizing sRNA sequencing data, the pipeline also allows users to search for phase-distributed nat-siRNAs within the perfectly annealed regions of the NAT pairs. Additionally, more reliable nat-siRNA loci can be identified based on degradome sequencing data. A case study on the non-model plant Dendrobium officinale was performed to illustrate the utility of NATpipe. Finally, we hope that NATpipe will be a useful tool for NAT prediction, nat-siRNA discovery, and related functional studies. NATpipe is available at www.bioinfolab.cn/NATpipe/NATpipe.zip.
Project description: Natural antisense transcripts (NATs) are a class of endogenous coding or non-protein-coding RNAs with sequence complementarity to other transcripts. Several lines of evidence have shown that cis- and trans-NATs may participate in a broad range of gene regulatory events. Genome-wide identification of cis-NATs in human, mouse and rice has revealed their widespread occurrence in eukaryotes. However, little is known about cis-NATs in the model plant Arabidopsis thaliana. We developed a new computational method to predict and identify cis-encoded NATs in Arabidopsis and found 1,340 potential NAT pairs. The expression of both sense and antisense transcripts of 957 NAT pairs was confirmed using Arabidopsis full-length cDNAs and public massively parallel signature sequencing (MPSS) data. Three known or putative Arabidopsis imprinted genes have cis-antisense transcripts. Sequences and the genomic arrangement of two Arabidopsis NAT pairs are conserved in rice. We combined information from full-length cDNAs and Arabidopsis genome annotation in our NAT prediction work and reported cis-NAT pairs that could not otherwise be identified by using only one of the two datasets. Analysis of MPSS data suggested that for most Arabidopsis cis-NAT pairs, there is predominant expression of one of the two transcripts in a tissue-specific manner.
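The core idea behind cis-NAT prediction, finding transcripts on opposite strands of the same locus whose coordinates overlap, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the method developed in the study above: the `Transcript` record, the `min_overlap` parameter, and the example coordinates are all hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Transcript:
    name: str
    chrom: str
    start: int   # 1-based, inclusive (assumed convention)
    end: int     # inclusive
    strand: str  # "+" or "-"


def overlap_length(a: Transcript, b: Transcript) -> int:
    """Number of shared genomic positions; 0 if on different chromosomes."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def find_cis_nat_pairs(transcripts: list[Transcript],
                       min_overlap: int = 1) -> list[tuple[str, str, int]]:
    """Report opposite-strand transcript pairs that overlap by >= min_overlap nt."""
    pairs = []
    for a, b in combinations(transcripts, 2):
        if a.strand != b.strand:
            shared = overlap_length(a, b)
            if shared >= min_overlap:
                pairs.append((a.name, b.name, shared))
    return pairs


# Hypothetical annotation with four transcripts.
example = [
    Transcript("geneA", "chr1", 100, 500, "+"),
    Transcript("geneB", "chr1", 450, 900, "-"),
    Transcript("geneC", "chr1", 480, 700, "+"),
    Transcript("geneD", "chr2", 100, 500, "-"),
]
pairs = find_cis_nat_pairs(example)
for sense, antisense, shared in pairs:
    print(sense, antisense, shared)
```

The all-against-all comparison is quadratic in the number of transcripts; for a whole-genome annotation, sorting by chromosome and start position or using an interval tree would be the more practical choice.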