Computational identification of putative lincRNAs in mouse embryonic stem cell.
ABSTRACT: As regulatory factors, lncRNAs play critical roles in embryonic stem cells. LincRNAs are the most widely studied lncRNAs; however, a large number of lncRNAs might still remain undiscovered. In this study, we constructed a de novo transcriptome assembly and detected 6,701 putative long intergenic non-coding transcripts (lincRNAs) expressed in mouse embryonic stem cells (ESCs), which might be incomplete, lacking coverage of their 5' ends as assessed by CAGE peaks. Comparing the TSS proximal regions of known lincRNAs with those of their closest protein-coding transcripts, our results revealed that lincRNA TSS proximal regions are associated with characteristic genomic and epigenetic features. Subsequently, 1,293 lincRNAs were corrected at their 5' ends using putative lincRNA TSS regions predicted by a TSS proximal region prediction model based on genomic and epigenetic features. Finally, 43 putative lincRNAs were annotated with Gene Ontology terms. In conclusion, this work provides a novel catalog of mouse ESC-expressed lincRNAs with relatively complete transcript lengths, which might be useful for investigating the transcriptional and post-transcriptional regulation of lincRNAs in mouse ESCs and even in mammalian development.
Project description:Many nascent long non-coding RNAs (lncRNAs) undergo the same maturation steps as pre-mRNAs of protein-coding genes (PCGs), but they are often poorly spliced. To identify the underlying mechanisms for this phenomenon, we searched for putative splicing inhibitory sequences using the ncRNA-a2 as a model. Genome-wide analyses of intergenic lncRNAs (lincRNAs) revealed that lincRNA splicing efficiency positively correlates with 5'ss strength, while no such correlation was identified for PCGs. In addition, efficiently spliced lincRNAs have higher thymidine content in the polypyrimidine tract (PPT) compared to efficiently spliced PCGs. Using model lincRNAs, we provide experimental evidence that strengthening the 5'ss and increasing the T content in the PPT significantly enhances lincRNA splicing. We further showed that lincRNA exons contain fewer putative binding sites for SR proteins. To map binding of SR proteins to lincRNAs, we performed iCLIP with SRSF2, SRSF5 and SRSF6 and analyzed eCLIP data for SRSF1, SRSF7 and SRSF9. All examined SR proteins bind lincRNA exons to a much lower extent than expression-matched PCGs. We propose that lincRNAs lack the cooperative interaction network that enhances splicing, which renders their splicing outcome more dependent on the optimality of splice sites.
Project description:Next Generation Sequencing (NGS) strategies, like RNA-Seq, have revealed the transcription of a wide variety of long non-coding RNAs (lncRNAs) in the genomes of several organisms. In the present work we assessed the lncRNAs complement of Schistosoma mansoni, the blood fluke that causes schistosomiasis, ranked among the most prevalent parasitic diseases worldwide. We focused on the long intergenic/intervening ncRNAs (lincRNAs), hidden within the large amount of information obtained through RNA-Seq in S. mansoni (88 libraries). Our computational pipeline identified 7029 canonically-spliced putative lincRNA genes on 2596 genomic loci (at an average 2.7 isoforms per lincRNA locus), as well as 402 spliced lncRNAs that are antisense to protein-coding (PC) genes. Hundreds of lincRNAs showed traits for being functional, such as the presence of epigenetic marks at their transcription start sites, evolutionary conservation among other schistosome species and differential expression across five different life-cycle stages of the parasite. Real-time qPCR has confirmed the differential life-cycle stage expression of a set of selected lincRNAs. We have built PC gene and lincRNA co-expression networks, unraveling key biological processes where lincRNAs might be involved during parasite development. This is the first report of a large-scale identification and structural annotation of lncRNAs in the S. mansoni genome.
Project description:Long noncoding RNAs (lncRNAs) have emerged as important regulators of many biological processes, including embryogenesis and development. To provide a systematic analysis of lncRNAs expressed during chicken embryogenesis, we used Iso-Seq and RNA-Seq to identify potential lncRNAs at embryonic stages from d 1 to d 8 of incubation: sequential stages covering gastrulation, somitogenesis, and organogenesis. The data characterized an expanded landscape of lncRNAs, yielding 45,410 distinct lncRNAs (31,282 genes). Amongst these, a set of 13,141 filtered intergenic lncRNAs (lincRNAs) transcribed from 9803 lincRNA gene loci, of which, 66.5% were novel, were further analyzed. These lincRNAs were found to share many characteristics with mammalian lincRNAs, including relatively short lengths, fewer exons, lower expression levels, and stage-specific expression patterns. Functional studies motivated by "guilt-by-association" associated individual lincRNAs with specific GO functions, providing an important resource for future studies of lincRNA function. Most importantly, a weighted gene co-expression network analysis suggested that genes of the brown module were specifically associated with the day 2 stage. LincRNAs within this module were co-expressed with proteins involved in hematopoiesis and lipid metabolism. This study presents the systematic identification of lincRNAs in developing chicken embryos and will serve as a powerful resource for the study of lincRNA functions.
Project description:While thousands of large intergenic non-coding RNAs (lincRNAs) have been identified in mammals, few have been functionally characterized, leading to debate about their biological role. To address this, we performed loss-of-function studies on most lincRNAs expressed in mouse embryonic stem cells (ESCs) and characterized the effects on gene expression. Here we show that knockdown of lincRNAs has major consequences on gene expression patterns, comparable to knockdown of well-known ESC regulators. Notably, lincRNAs primarily affect gene expression in trans. We identify dozens of lincRNAs whose knockdown causes an exit from the pluripotent state or upregulation of lineage commitment programs. We integrate lincRNAs into the molecular circuitry of ESCs and show that lincRNA genes are regulated by key transcription factors and that lincRNA transcripts physically bind to multiple chromatin regulatory proteins to affect shared gene expression programs. Together, the results demonstrate that lincRNAs have key roles in the circuitry controlling ESC state. We generated five lentiviral-based shRNAs targeting each of the 237 lincRNAs previously identified in ESCs. These shRNAs successfully targeted 147 lincRNAs and reduced their expression by an average of ~75% compared to endogenous levels in ESCs. As positive controls, we generated shRNAs targeting ~50 genes encoding regulatory proteins, including both transcription factor and chromatin factor genes that have been shown to play critical roles in ESC regulation; we obtained validated hairpins against 40 of these genes. As negative controls, we performed independent infections with lentiviruses containing 27 different shRNAs with no known cellular target RNA.
Project description:Long non-coding RNAs (lncRNAs) play important roles in genomic imprinting, cancer, differentiation and regulation of gene expression. Here, we identified 3844 long intergenic ncRNAs (lincRNAs) in Plutella xylostella, the diamondback moth (DBM), a notorious pest of cruciferous plants that has developed field resistance to all classes of insecticides, including Bacillus thuringiensis (Bt) endotoxins. Further, we found that some of those lincRNAs may potentially serve as precursors for the production of small ncRNAs. We found 280 and 350 lincRNAs that are differentially expressed in Chlorpyrifos and Fipronil resistant larvae. A survey of P. xylostella midgut transcriptome data from Bt-resistant populations revealed 59 altered lincRNAs in two resistant strains compared with the susceptible population. We validated the transcript levels of a number of putative lincRNAs in deltamethrin-resistant larvae that were exposed to deltamethrin, which indicated that this group of lincRNAs might be involved in the response to xenobiotics in this insect. To functionally characterize DBM lincRNAs, gene ontology (GO) enrichment of their associated protein-coding genes was extracted and showed overrepresentation of protein, DNA and RNA binding GO terms. The data presented here will facilitate future studies to unravel the function of lincRNAs in insecticide resistance or the response to xenobiotics of eukaryotic cells.
Project description:Long intergenic noncoding RNAs (lincRNAs) are increasingly recognized as important mediators of many biological processes relevant to human pathophysiologies, including cardiovascular diseases. In vitro studies have provided important knowledge of cellular functions and mechanisms for an increasing number of lincRNAs. Dysregulated lncRNAs have been associated with cell fate programming and development, vascular diseases, atherosclerosis, dyslipidemia and metabolic syndrome, and cardiac pathological hypertrophy. However, functional interrogation of individual lincRNAs in physiological and disease states is largely limited. The complex nature of lincRNA actions and poor species conservation of human lincRNAs pose substantial challenges to physiological studies in animal model systems and in clinical translation. This review summarizes recent findings of specific lincRNA physiological studies, including MALAT1, MeXis, Lnc-DC and others, in the context of cardiovascular diseases, examines complex mechanisms of lincRNA actions, reviews in vivo research strategies to delineate lincRNA functions and highlights challenges and approaches for physiological studies of primate-specific lincRNAs.
Project description:Long intergenic non-coding RNAs (lincRNAs) are emerging as an important class of regulatory RNAs with a variety of biological functions. The aim of this study was to identify the lincRNA profile in the dengue vector Aedes aegypti and evaluate their potential role in host-pathogen interaction. The majority of previous RNA-Seq transcriptome studies in Ae. aegypti have focused on the expression pattern of annotated protein-coding genes under different biological conditions. Here, we used 35 publicly available RNA-Seq datasets with relatively high depth to screen the Ae. aegypti genome for lincRNA discovery. This led to the identification of 3,482 putative lincRNAs. These lincRNA genes displayed a slightly lower GC content and shorter transcript lengths compared to protein-encoding genes. Ae. aegypti lincRNAs also demonstrate low evolutionary sequence conservation even among closely related species such as Culex quinquefasciatus and Anopheles gambiae. We examined their expression in dengue virus serotype 2 (DENV-2) and Wolbachia infected and non-infected adult mosquitoes and Aa20 cells. The results revealed that DENV-2 infection increased the abundance of a number of host lincRNAs, some of which suppress viral replication in mosquito cells. RNAi-mediated silencing of lincRNA_1317 led to enhancement in viral replication, which possibly indicates its involvement in the host anti-viral defense. A number of lincRNAs were also differentially expressed in Wolbachia-infected mosquitoes. The results will facilitate future studies to unravel the function of lncRNAs in insects and may prove to be beneficial in developing new ways to control vectors or inhibit replication of viruses in them.
Project description:BACKGROUND: The stability of long intergenic non-coding RNAs (lincRNAs) that possess tissue/cell-specific expression might be closely related to their physiological functions. However, the mechanisms underlying lincRNA stability remain elusive. In this study, we examined lincRNA stability in K562 cells, an important model cell line, by comparing two K562 transcriptomes, one obtained from the ENCODE Consortium and one from our own sequenced RNA-Seq dataset (PH). RESULTS: Using a lincRNA analysis pipeline, 1804 high-confidence lincRNAs comprising 1564 annotated lincRNAs and 240 putative novel lincRNAs were identified in PH, and 1587 high-confidence lincRNAs comprising 1429 annotated lincRNAs and 158 putative novel lincRNAs in ENCODE. There were 1009 lincRNAs unique to PH, 792 unique to ENCODE, and 795 overlapping lincRNAs present in both datasets. Analysis of the differences in minimum free energy distribution and lincRNA half-life showed that a large proportion of overlapping lincRNAs were more stable than the unique lincRNAs. Comparison of minimum free energy indicated that most lincRNAs were less stable than protein-coding RNAs. CONCLUSIONS: Identification of overlapping and unique lincRNAs can be helpful to classify the stability of lincRNAs. Our results suggest that overlapping lincRNAs (relatively stable lincRNAs) and unique lincRNAs (relatively unstable lincRNAs) might be involved in different cellular processes. REVIEWERS: This article has been reviewed by Prof. Oliviero Carugo, Dr. Alistair Forrest and Prof. Manju Bansal.
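The counts quoted in the K562 abstract above can be cross-checked with simple arithmetic: each dataset's total should equal its annotated plus novel lincRNAs, and also its unique lincRNAs plus the overlap. The following is a minimal sketch using only the numbers reported above. The variable names are illustrative, and the Jaccard index is an assumed summary measure that the study itself does not report.

```python
# Counts quoted in the abstract (PH = the authors' own RNA-Seq dataset).
ph_total, ph_annotated, ph_novel = 1804, 1564, 240
encode_total, encode_annotated, encode_novel = 1587, 1429, 158
ph_unique, encode_unique, overlap = 1009, 792, 795

# Annotated + novel should reproduce each dataset total.
composition_ok = (
    ph_annotated + ph_novel == ph_total
    and encode_annotated + encode_novel == encode_total
)

# Unique + overlapping should also reproduce each dataset total.
partition_ok = (
    ph_unique + overlap == ph_total
    and encode_unique + overlap == encode_total
)

# Number of distinct lincRNAs across both datasets.
union_size = ph_unique + encode_unique + overlap

# Jaccard index: an assumed, illustrative overlap measure (not from the study).
jaccard = overlap / union_size

print(composition_ok, partition_ok, union_size, round(jaccard, 3))
# prints: True True 2596 0.306
```

Under this measure, roughly 31% of the distinct high-confidence lincRNAs are shared between the two transcriptomes.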
Project description:BACKGROUND: The aetiology of Crohn's disease [CD] involves immune dysregulation in a genetically susceptible individual. Genome-wide association studies [GWAS] have identified 200 loci associated with CD, ulcerative colitis, or both, most of which fall within non-coding DNA regions. Long non-coding RNAs [lncRNAs] regulate gene expression by diverse mechanisms and have been associated with disease activity in inflammatory bowel disease. However, disease-associated lncRNAs have not been characterised in pathogenic immune cell populations. METHODS: Terminal ileal samples were obtained from 22 CD patients and 13 controls. RNA from lamina propria CD4+ T cells was sequenced and long intergenic non-coding RNAs [lincRNAs] were detected. Overall expression patterns, differential expression [DE], and pathway and gene enrichment analyses were performed. Knockdown of novel lincRNAs XLOC_000261 and XLOC_000014 was performed. Expression of Th1 or Th17-associated transcription factors, T-bet and RORγt, respectively, was assessed by flow cytometry. RESULTS: A total of 6402 lincRNAs were expressed, 960 of which were novel. Unsupervised clustering and principal component analysis showed that the lincRNA expression discriminated patients from controls. A total of 1792 lincRNAs were DE, and 295 [79 novel; 216 known] mapped to 267 of 5727 DE protein-coding genes. The novel lincRNAs were enriched in inflammatory and Notch signalling pathways [p < 0.05]. Furthermore, DE lincRNAs in CD patients were more frequently found in DNA regions with known inflammatory bowel disease [IBD]-associated loci. The novel lincRNA XLOC_000261 negatively regulated RORγt expression in Th17 cells. CONCLUSIONS: We describe a novel set of DE lincRNAs in CD-associated CD4+ cells and demonstrate that novel lincRNA XLOC_000261 appears to negatively regulate RORγt protein expression in Th17 cells.
Project description:Long non-protein-coding RNAs (lncRNAs) have been identified in many different organisms and cell types. Emerging examples emphasize the biological importance of these RNA species, but their regulation and functions remain poorly understood. In the filamentous fungus Neurospora crassa, the annotation and characterization of lncRNAs is incomplete. We have performed a comprehensive transcriptome analysis of Neurospora crassa by using ChIP-seq, RNA-seq and polysome fractionation datasets. We have annotated and characterized 1478 long intergenic noncoding RNAs (lincRNAs) and 1056 natural antisense transcripts, indicating that 20% of the RNA Polymerase II transcripts of Neurospora are not coding for protein. Both classes of lncRNAs accumulate at lower levels than protein-coding mRNAs and they are considerably shorter. Our analysis showed that the vast majority of lincRNAs and antisense transcripts do not contain introns and carry fewer H3K4me2 modifications than similarly expressed protein-coding genes. In contrast, H3K27me3 modifications inversely correlate with transcription of protein-coding and lincRNA genes. We furthermore show that most lincRNA sequences evolve rapidly, even between phylogenetically close species. Our transcriptome analyses revealed distinct features of Neurospora lincRNAs and antisense transcripts in comparison to mRNAs and showed that the prevalence of noncoding transcripts in this organism is higher than previously anticipated. The study provides a broad repertoire and a resource for further studies of lncRNAs.