Novel insight into the non-coding repertoire through deep sequencing analysis.
ABSTRACT: Non-coding RNAs (ncRNAs) account for a large portion of the transcribed genomic output. This diverse family of untranslated RNA molecules plays a crucial role in cellular function. The use of 'deep sequencing' technology (also known as 'next generation sequencing') to infer transcript expression levels in general, and ncRNA levels specifically, is becoming increasingly common in molecular and clinical laboratories. We developed software termed 'RandA' (which stands for ncRNA Read-and-Analyze) that performs comprehensive ncRNA profiling and differential expression analysis on deep sequencing data through a graphical user interface running on a local personal computer. Using RandA, we reveal the complexity of the ncRNA repertoire in a given cell population. We further demonstrate the relevance of such an extensive ncRNA analysis by elucidating a multitude of characterizing features in pathogen-infected mammalian cells. RandA is available for download at http://ibis.tau.ac.il/RandA.
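As a rough illustration of what differential expression analysis on read counts involves, the sketch below normalizes per-ncRNA read counts to counts per million (CPM) and reports a log2 fold change between two samples. It is not RandA's actual method, which the abstract does not describe; the function names, the pseudocount, and the example counts are all assumptions made for illustration.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {name: count * 1_000_000 / total for name, count in counts.items()}


def log2_fold_changes(control, treated, pseudocount=1.0):
    """Return log2(treated CPM / control CPM) for every ncRNA seen in either sample.

    The pseudocount (an assumed value) avoids division by zero for ncRNAs
    that are absent from one of the samples.
    """
    cpm_control = counts_per_million(control)
    cpm_treated = counts_per_million(treated)
    names = sorted(set(cpm_control) | set(cpm_treated))
    return {
        name: math.log2(
            (cpm_treated.get(name, 0.0) + pseudocount)
            / (cpm_control.get(name, 0.0) + pseudocount)
        )
        for name in names
    }


# Hypothetical read counts for two ncRNAs in uninfected and infected cells.
uninfected = {"ncRNA_a": 500_000, "ncRNA_b": 500_000}
infected = {"ncRNA_a": 750_000, "ncRNA_b": 250_000}
fold_changes = log2_fold_changes(uninfected, infected)
```

A real analysis would also need replicates and a statistical test before calling any ncRNA differentially expressed; the fold change alone only ranks candidates.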
Project description:AIM: The present study was designed to identify other noncoding RNAs (ncRNAs) in the corpus luteum (CL) during early pregnancy in buffalo. MATERIALS AND METHODS: For this study, CL (n=2) from two buffalo gravid uteri, obtained from the slaughterhouse, were snap frozen in liquid nitrogen (-196°C) and transported to the laboratory. The stage of pregnancy was determined by measuring the crown-rump region of the fetus. This was followed by isolation of RNA and deep sequencing. After deep sequencing, the obtained reads were checked and aligned against various ncRNA databases (GtRNA, RFAM, and deep guide). Various parameters, namely the frequency of specific ncRNAs, their length, mismatches, and target genomic location in several model species, were determined. RESULTS: The frequency of piwi-interacting RNAs (piwi-RNAs) having target locations in rodent and human genomes was significantly higher than that of other piwi-RNAs and ncRNAs. The ribosomal RNA (rRNA) fragments identified ranged from 17 to 50 nucleotides (nts) in length, with shorter fragments occurring more often than longer ones. The target on 16S rRNA species confirms the conservation of 16S rRNA across species. With respect to transfer RNA (tRNA), the abundantly occurring tRNAs were unique with no duplication. Small nucleolar RNAs (snoRNAs) identified in this study showed a strong tendency for coding box C/D snoRNAs in comparison to H/ACA snoRNAs. Regulatory and evolutionary implications of these identified ncRNAs are yet to be delineated in many species, including buffaloes. CONCLUSION: This is the first report of identification of other ncRNAs in CL of early pregnancy in buffalo.
Project description:BACKGROUND: In the last decade, sequencing projects have led to the development of a number of annotation systems dedicated to the structural and functional annotation of protein-coding genes. These annotation systems handle the annotation of non-protein-coding genes (ncRNAs) in a very crude way, allowing neither the editing of secondary structures nor the clustering of ncRNA genes into families, both of which are crucial for appropriate annotation of these molecules. RESULTS: LeARN is a flexible software package which handles the complete process of ncRNA annotation by integrating the layers of automatic detection and human curation. CONCLUSION: This software provides the infrastructure to deal properly with ncRNAs in the framework of any annotation project. It fills the gap between existing prediction software, which detects independent ncRNA occurrences, and public ncRNA repositories, which do not offer the flexibility and interactivity required for annotation projects. The software is freely available from the download section of the website http://bioinfo.genopole-toulouse.prd.fr/LeARN.
Project description:Antibody repertoire analysis by high throughput sequencing is now widely used, but a persisting challenge is enabling immunologists to explore their data to discover discriminating repertoire features for their own particular investigations. Computational methods are necessary for large-scale evaluation of antibody properties. We have developed BRepertoire, a suite of user-friendly web-based software tools for large-scale statistical analyses of repertoire data. The software is able to use data preprocessed by IMGT, and performs statistical and comparative analyses with versatile plotting options. BRepertoire has been designed to operate in various modes, for example analysing sequence-specific V(D)J gene usage, discerning physico-chemical properties of the CDR regions and clustering of clonotypes. Those analyses are performed on the fly by a number of R packages and are deployed by a shiny web platform. The user can download the analysed data in different table formats and save the generated plots as image files ready for publication. We believe BRepertoire to be a versatile analytical tool that complements experimental studies of immune repertoires. To illustrate the server's functionality, we show use cases including differential gene usage in a vaccination dataset and analysis of CDR3H properties in old and young individuals. The server is accessible under http://mabra.biomed.kcl.ac.uk/BRepertoire.
Project description:Epstein-Barr virus (EBV) infection of human B cells requires the presence of non-coding RNAs (ncRNAs), which regulate expression of viral and host genes. To identify differentially expressed regulatory ncRNAs involved in EBV infection, a specialized cDNA library, enriched for ncRNAs derived from EBV-infected cells, was subjected to deep-sequencing. From the deep-sequencing analysis, we generated a custom-designed ncRNA-microchip to investigate differential expression of ncRNA candidates. By this approach, we identified 25 differentially expressed novel host-encoded ncRNA candidates in EBV-infected cells, comprised of six non-repeat-derived and 19 repeat-derived ncRNAs. Upon EBV infection of B cells, we also observed increased expression levels of oncogenic miRNAs mir-221 and mir-222, which might contribute to EBV-related tumorigenesis, as well as decreased expression levels of RNase P RNA, a ribozyme involved in tRNA maturation. Thus, in this study we demonstrate that our ncRNA-microchip approach serves as a powerful tool to identify novel differentially expressed ncRNAs acting as potential regulators of gene expression during EBV infection.
Project description:Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances, including the use of ancient DNA. Approximate Bayesian computation (ABC) stands among the most promising methods due to its simple theoretical foundation and exceptional flexibility. However, the limited availability of user-friendly programs that perform ABC analysis makes it difficult to apply, so programming skills are frequently required. In addition, few programs are able to deal with heterochronous data. Here we present the software BaySICS: Bayesian Statistical Inference of Coalescent Simulations. BaySICS provides an integrated and user-friendly platform that performs ABC analyses by means of coalescent simulations from DNA sequence data. It estimates historical demographic population parameters and performs hypothesis testing by means of Bayes factors obtained from model comparisons. Although it provides specific features that improve inference from datasets with heterochronous data, BaySICS also has several capabilities that make it a suitable tool for analysing contemporary genetic datasets. Those capabilities include joint analysis of independent tables, a graphical interface and the implementation of Markov-chain Monte Carlo without likelihoods.
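To make the idea of ABC by coalescent simulation concrete, here is a minimal rejection-sampling sketch. It is a toy, not BaySICS: it assumes a constant-size population, a uniform prior on the scaled mutation rate theta, and a single summary statistic (the number of segregating sites). All function names, parameter values and the observed value are hypothetical.

```python
import random


def simulate_segregating_sites(theta, n_samples, rng):
    """Simulate the number of segregating sites under a constant-size coalescent."""
    # Total branch length: while k lineages remain, the waiting time T_k is
    # exponential with rate k(k-1)/2 and contributes k * T_k to the tree length.
    total_length = 0.0
    for k in range(2, n_samples + 1):
        total_length += k * rng.expovariate(k * (k - 1) / 2)
    # Mutations fall on the tree as a Poisson process with rate theta / 2
    # per unit of branch length.
    mean_mutations = theta / 2 * total_length
    # Poisson draw: count unit-rate exponential arrivals before mean_mutations.
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean_mutations:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def abc_rejection(observed, n_samples, n_draws, tolerance, prior_max, seed=0):
    """Keep prior draws of theta whose simulated statistic is close to the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, prior_max)  # draw from the (assumed) uniform prior
        simulated = simulate_segregating_sites(theta, n_samples, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(theta)
    return accepted


# Hypothetical data: 20 segregating sites observed in a sample of 10 sequences.
posterior_sample = abc_rejection(
    observed=20, n_samples=10, n_draws=5000, tolerance=1, prior_max=30.0
)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
```

The accepted values approximate the posterior distribution of theta. Plain rejection is shown here only because it is the simplest ABC variant; the Markov-chain Monte Carlo without likelihoods mentioned in the abstract explores the parameter space differently.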
Project description:Imprinted noncoding RNAs (ncRNAs) are expressed mono-allelically in a parent-of-origin-dependent manner, which is mainly evident in mammals. Lying at a crossroad between imprinted genes and ncRNAs, imprinted ncRNAs show distinct features. They are likely to function in nontraditional ways compared to non-imprinted ncRNAs, and are much more responsible for the mechanism of genomic imprinting compared to imprinted protein-coding genes. An increasing number of human diseases have been shown to be related to abnormalities in imprinted ncRNAs. Due to their functional importance, many studies focusing on imprinted ncRNAs have been published in recent years; however, there is no systematic collection or description of imprinted ncRNAs and the rapidly growing knowledge is scattered in various places. Here, we describe a new database, ncRNAimprint, which serves as a comprehensive resource center for mammalian imprinted ncRNAs. A catalog of imprinted ncRNAs, including snoRNAs, microRNAs, piRNAs, siRNAs, antisense ncRNAs, and mRNA-like ncRNAs, was annotated in detail using information extracted from relevant literature and databases. Comprehensive collections of imprinted ncRNA-related diseases, imprinting control regions (ICRs), and imprinted regions were manually compiled to provide resources for current research focusing on imprinted ncRNAs. Small RNA deep sequencing reads that fully matched within imprinted regions were also included to offer useful evidence in detecting novel imprinted ncRNAs and to aid in analyzing expression patterns of known imprinted ncRNAs. A search page including four effective search forms and two graphical browsers was created for rapid retrieval and visualization of these data. The imprinted ncRNA database is freely accessible at http://rnaqueen.sysu.edu.cn/ncRNAimprint.
Project description:Enterococcus faecalis is a commensal bacterium and a major opportunistic human pathogen. In this study, we combined in silico predictions with a novel 5'RACE-derivative method coined '5'tagRACE' to perform the first search for non-coding RNAs (ncRNAs) encoded on the E. faecalis chromosome. We used 5'tagRACE to simultaneously probe and characterize primary transcripts, and demonstrate here the simplicity, reliability and sensitivity of the method. 5'tagRACE is complementary to tiling arrays and RNA-sequencing methods, is also directly applicable to deep RNA sequencing, and should significantly improve functional studies of bacterial RNA landscapes. From 45 selected loci of the E. faecalis chromosome, we discovered and mapped 29 novel ncRNAs, 10 putative novel mRNAs and 16 antisense transcriptional organizations. We describe in more detail the oxygen-dependent expression of one ncRNA located in an E. faecalis pathogenicity island and the existence of an ncRNA that is antisense to SsrS, the ncRNA modulator of the RNA polymerase, and provide evidence for the functional interplay between two distinct toxin-antitoxin modules.
Project description:Long non-coding RNAs (lncRNAs) have emerged as a class of factors that are important for regulating development and cancer. Computational prediction of lncRNAs from ultra-deep RNA sequencing has been successful in identifying candidate lncRNAs. However, the complexity of handling and integrating different types of genomics data poses significant challenges to experimental laboratories that lack extensive genomics expertise. To address this issue, we have developed lncRNA-screen, a comprehensive pipeline for computationally screening putative lncRNA transcripts over large multimodal datasets. The main objective of this work is to facilitate the computational discovery of lncRNA candidates to be further examined by functional experiments. lncRNA-screen provides a fully automated, easy-to-run pipeline which performs data download, RNA-seq alignment, assembly, quality assessment, transcript filtration, novel lncRNA identification, coding potential estimation, expression level quantification, histone mark enrichment profile integration, differential expression analysis, annotation with other types of segmented data (CNVs, SNPs, Hi-C, etc.) and visualization. Importantly, lncRNA-screen generates an interactive report summarizing all interesting lncRNA features, including genome browser snapshots and lncRNA-mRNA interactions based on Hi-C data. lncRNA-screen provides a comprehensive solution for lncRNA discovery and an intuitive interactive report for identifying promising lncRNA candidates. lncRNA-screen is available as open-source software on GitHub.
Project description:Transposable elements (TEs) play an essential role in the genetic variability of eukaryotic species. In plants, they may comprise up to 90% of the total genome. Non-coding RNAs (ncRNAs) are known to control gene expression and regulation. Although the relationship between ncRNAs and TEs is known, obtaining organized data for sequenced genomes is not straightforward. In this study, we describe PlaNC-TE (http://planc-te.cp.utfpr.edu.br), a user-friendly portal harboring a knowledgebase created by integrating and analysing plant ncRNA-TE data. We identified a total of 14 350 overlaps between ncRNAs and TEs in 40 plant genomes. The database allows users to browse, search and download all ncRNA and TE data analysed. Overall, PlaNC-TE not only organizes data and provides insights about the relationship between ncRNAs and TEs in plants but also helps improve genome annotation strategies. Moreover, this is the first database to provide resources to broadly investigate functions and mechanisms involving TEs and ncRNAs in plants.
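Detecting overlaps between ncRNA and TE annotations can be sketched as a simple interval comparison. The example below is not the PlaNC-TE pipeline, which the abstract does not detail; it assumes each annotation is a tuple of (identifier, chromosome, start, end) with 0-based, half-open coordinates, and all identifiers and coordinates are made up.

```python
from collections import defaultdict


def find_overlaps(ncrnas, tes):
    """Return (ncRNA id, TE id) pairs whose intervals share at least one base.

    Records are (id, chrom, start, end) with 0-based, half-open coordinates
    (an assumption of this sketch).
    """
    # Group TEs by chromosome so each ncRNA is compared only with TEs
    # on the same sequence.
    tes_by_chrom = defaultdict(list)
    for te_id, chrom, start, end in tes:
        tes_by_chrom[chrom].append((te_id, start, end))

    overlaps = []
    for nc_id, chrom, nc_start, nc_end in ncrnas:
        for te_id, te_start, te_end in tes_by_chrom.get(chrom, []):
            # Two half-open intervals overlap when each starts before the other ends.
            if nc_start < te_end and te_start < nc_end:
                overlaps.append((nc_id, te_id))
    return overlaps


# Hypothetical annotations.
ncrnas = [
    ("nc1", "chr1", 100, 200),
    ("nc2", "chr1", 500, 600),
    ("nc3", "chr2", 100, 200),
]
tes = [
    ("te1", "chr1", 150, 300),
    ("te2", "chr1", 200, 250),
    ("te3", "chr2", 50, 101),
]
overlap_pairs = find_overlaps(ncrnas, tes)
# overlap_pairs == [("nc1", "te1"), ("nc3", "te3")]
```

The nested loop is quadratic per chromosome, which is fine for a demonstration; genome-scale annotation sets would normally use sorted intervals or an interval tree instead.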