Project description:Large intergenic non-coding RNAs (lincRNAs) are emerging as key regulators of diverse cellular processes. Determining the function of individual lincRNAs remains a challenge. Recent advances in RNA sequencing (RNA-Seq) and computational methods allow for an unprecedented analysis of such transcripts. Here, we present an integrative approach to define a reference catalogue of over 8,000 human lincRNAs. Our catalogue unifies previously existing annotation sources with transcripts we assembled from RNA-Seq data collected from ~4 billion RNA-Seq reads across 24 tissues and cell types. We characterize each lincRNA by a panorama of more than 30 properties, including sequence, structural, transcriptional, and orthology features. We find that lincRNA expression is strikingly tissue specific compared to coding genes, and that they are typically co-expressed with their neighboring genes, albeit to a similar extent to that of pairs of neighboring protein-coding genes. We distinguish an additional sub-set of transcripts that have high evolutionary conservation but may include short open reading frames, and may serve either as lincRNAs or as small peptides. Our integrated, comprehensive, yet conservative reference catalogue of human lincRNAs reveals the global properties of lincRNAs and will facilitate experimental studies and further functional classification of these genes. We profiled transcriptome expression using polyadenylated mRNA-Seq. We then used these data to reconstruct the transcriptome with de novo assemblers and to identify long non-coding RNAs and their expression.
Project description:This SuperSeries is composed of the following subset Series:
GSE23968: Large intergenic non-coding RNAs as novel modulators of reprogramming: ESCs, fibroblast, and fibroblast-derived iPSC (gene expression)
GSE23970: Large intergenic non-coding RNAs as novel modulators of reprogramming: human embryonic stem cells, CD34+ cells, and CD34+ derived induced pluripotent stem cells (LincRNA expression)
GSE23973: Large intergenic non-coding RNAs as novel modulators of reprogramming: siRNA (gene expression)
GSE24181: Large intergenic non-coding RNAs as novel modulators of reprogramming: human embryonic stem cells, fibroblasts, and fibroblast-derived induced pluripotent stem cells (LincRNA expression)
Refer to individual Series.
Project description:Large intergenic non-coding RNAs (lincRNAs) are emerging as key regulators of diverse cellular processes. Determining the function of individual lincRNAs remains a challenge. Recent advances in RNA sequencing (RNA-Seq) and computational methods allow for an unprecedented analysis of such transcripts. Here, we present an integrative approach to define a reference catalogue of over 8,000 human lincRNAs. Our catalogue unifies previously existing annotation sources with transcripts we assembled from RNA-Seq data collected from ~4 billion RNA-Seq reads across 24 tissues and cell types. We characterize each lincRNA by a panorama of more than 30 properties, including sequence, structural, transcriptional, and orthology features. We find that lincRNA expression is strikingly tissue specific compared to coding genes, and that they are typically co-expressed with their neighboring genes, albeit to a similar extent to that of pairs of neighboring protein-coding genes. We distinguish an additional sub-set of transcripts that have high evolutionary conservation but may include short open reading frames, and may serve either as lincRNAs or as small peptides. Our integrated, comprehensive, yet conservative reference catalogue of human lincRNAs reveals the global properties of lincRNAs and will facilitate experimental studies and further functional classification of these genes.
Project description:We determined the strand-specific transcriptome of the fission yeast S. pombe under multiple growth conditions using a novel RNA/DNA hybridization mapping (HybMap) technique. HybMap uses an antibody against an RNA/DNA hybrid to detect RNA molecules hybridized to a high density DNA oligonucleotide tiling microarray. HybMap exhibited exceptional dynamic range and reproducibility, and clearly revealed coding, non-coding and structural RNAs, as well as new RNAs conserved in distant yeast species. Virtually the entire euchromatic genome (including intergenic regions) is transcribed, with heterochromatin dampening intergenic transcription. Transcriptomes of alternative growth conditions reveal changes in both coding and non-coding RNAs. Interestingly, our analysis reveals large numbers of non-coding RNAs, extensive antisense transcription, new properties of antisense transcripts, and induced divergent transcription. Furthermore, HybMap informed the efficiency and locations of RNA splicing genome-wide. Finally, a remarkable feature is observed at heterochromatin boundaries inside centromeres: strand-specific transcription islands around tRNAs. These new features are discussed in terms of organism fitness and transcriptome evolution. Keywords: yeast, gene expression, bioinformatics
Project description:Interventions: Case series:Nil
Primary outcome(s): intestinal microecological disorders;blood non-coding RNAs and immune status
Study Design: Randomized parallel controlled trial
Project description:Although a large proportion of human transcription occurs outside the boundaries of known genes, the functional significance of this transcription remains unknown. We have compared the expression patterns of known genes as well as intergenic transcripts within the ENCODE regions between humans and chimpanzees in brain, heart, testis and lymphoblastoid cell lines. We find that intergenic transcripts show patterns of tissue-specific conservation of their expression which are comparable to exonic transcripts of known genes. This suggests that intergenic transcripts are subject to functional constraints that restrict their rate of evolutionary change as well as putative positive selection to an extent comparable to that of classical protein-coding genes. In brain and testis, we find that part of this intergenic transcription is caused by widespread use of alternative promoters. Further, we find that about half of the expression differences between humans and chimpanzees are due to intergenic transcripts. In order to transfer genetic information encoded in the genomic sequence of an organism into functional features, the genomic sequence must be transcribed. According to the current genome annotation, the human genome produces transcripts from 20,000–25,000 protein-coding genes and a smaller number of non-coding genes. Recently, the introduction of tiling arrays that enable the measurement of transcription regardless of previous annotation challenged this view by providing evidence that a large proportion of transcription occurs outside of annotated genes. It has been suggested that these transcripts represent previously unidentified functional RNAs as well as extensions of known genes. However, their lack of conservation between species on the DNA sequence level raises questions about their functionality.
In this study we assess the functionality of these novel transcripts by testing the extent to which their expression is conserved between humans and chimpanzees in different tissues. Surprisingly, we find the expression of known and novel transcripts to be conserved to the same relative extent when different tissues are compared. This suggests that both known and novel transcripts are equally affected by adaptive and stabilizing selection in different tissues during human and chimpanzee evolution. Further, in terms of absolute numbers, about half of the expression differences between humans and chimpanzees in any given tissue are due to intergenic transcripts.