ABSTRACT: Tissue-specific RNA-seq data were generated to assess the effect of the mother’s diet on the offspring’s transcriptome in utero and after weaning. This study is part of the FAANG project, which promotes rapid prepublication of data to support the research community. These data are released under the Fort Lauderdale principles, as confirmed in the Toronto Statement (Toronto International Data Release Workshop. Birney et al. 2009. Pre-publication data sharing. Nature 461:168-170). Any use of this dataset must abide by the FAANG data sharing principles. Data producers reserve the right to make the first publication of a global analysis of these data. If you are unsure whether you are allowed to publish on this dataset, please contact the FAANG Data Coordination Centre and FAANG Consortium (email faang-dcc@ebi.ac.uk and copy faang@iastate.edu) to enquire. The full guidelines can be found at http://www.faang.org/data-share-principle.
Project description:Several novel MHC class I epitope prediction tools additionally incorporate the abundance levels of the peptides' source antigens and have shown improved performance for predicting immunogenicity. Such tools require the user to input the MHC alleles and peptide sequences of interest, as well as the abundance levels of the peptides' source proteins. However, such expression data is often not directly available to users, and retrieving the expression level of a peptide's source antigen from public databases is not trivial. We have developed the Peptide eXpression annotator (pepX), which takes a peptide as input, identifies from which proteins the peptide can be derived, and returns an estimate of the expression level of those source proteins from selected public databases. We have also investigated how the abundance level of a peptide can be best estimated in cases when it can originate from multiple transcripts and proteins and found that summing up transcript-level expression values performs best in distinguishing ligands from decoy peptides.
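The aggregation strategy described above (summing transcript-level expression over all possible sources of a peptide) can be sketched as follows. This is a minimal illustration and not the pepX implementation: the function name, the toy protein sequences, the transcript identifiers, and the TPM values are all hypothetical, and source proteins are found here by plain substring matching.

```python
from typing import Dict


def summed_peptide_expression(
    peptide: str,
    transcript_proteins: Dict[str, str],
    transcript_tpm: Dict[str, float],
) -> float:
    """Sum expression over every transcript whose protein contains the peptide."""
    total = 0.0
    for transcript_id, protein_seq in transcript_proteins.items():
        # A transcript counts as a source if its translated sequence
        # contains the peptide as an exact substring.
        if peptide in protein_seq:
            total += transcript_tpm.get(transcript_id, 0.0)
    return total


# Hypothetical toy data: translated sequence and expression (TPM) per transcript.
proteins = {
    "tx1": "MKTAYIAKQRQISFVK",
    "tx2": "MSTAYIAKQRGG",
    "tx3": "MLLPPGGW",
}
tpm = {"tx1": 12.5, "tx2": 3.0, "tx3": 40.0}

# The peptide occurs in tx1 and tx2 but not in tx3.
peptide_tpm = summed_peptide_expression("TAYIAKQR", proteins, tpm)
print(peptide_tpm)
```

Summing, rather than taking the maximum or mean, treats every matching transcript as an independent contributor to the peptide's abundance, which is the option the abstract reports as performing best.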
Project description:Most organs and tissues are composed of many types of cells. To characterize cellular state, various transcription profiling approaches are currently available, including whole-tissue bulk RNA sequencing, single cell RNA sequencing (scRNA-Seq), and cell type-specific RNA sequencing. What is missing in this repertoire is a simple, versatile method for bulk transcriptional profiling of cell types for which cell type-specific genetic markers or antibodies are not readily available. We therefore developed Probe-Seq, which uses hybridization of gene-specific probes to RNA markers for isolation of specific types of cells, to enable downstream FACS isolation and bulk RNA sequencing. We show that this method can enable isolation and profiling of specific cell types from mouse retina, frozen human retina, Drosophila midgut, and developing chick retina, suggesting that it is likely useful for most organisms.
Project description:Tissue-specific alternative splicing is a key mechanism for generating tissue-specific proteomic diversity in eukaryotes. Splicing regulatory elements (SREs) in precursor messenger RNA (pre-mRNA) play a very important role in regulating alternative splicing. In this article, we use mouse RNA-Seq data to determine a positive data set where SREs are over-represented and a reliable negative data set where the same SREs are most likely under-represented for a specific tissue, and then employ a powerful discriminative approach to identify SREs. We identified 456 putative splicing enhancers or silencers, of which 221 were predicted to be tissue-specific. Most of our tissue-specific SREs are likely different from constitutive SREs, since only 18% of our exonic splicing enhancers (ESEs) are contained in constitutive RESCUE-ESEs. Only a relatively small portion (20%) of our SREs overlaps with the human tissue-specific SREs identified in two recent studies. In the analysis of the position distribution of SREs, we found that a dozen SREs were biased to a specific region. We also identified two very interesting SREs that, from the same intronic region, can function as an enhancer in one tissue but as a silencer in another. These findings provide insight into the mechanism of tissue-specific alternative splicing and give a set of valuable putative SREs for further experimental investigations.
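The idea of contrasting a positive and a negative sequence set can be illustrated with a simple k-mer enrichment score. The abstract does not specify its discriminative approach, so the frequency ratio with a pseudocount used below is a swapped-in stand-in, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def kmer_counts(sequences: Iterable[str], k: int) -> Counter:
    """Count every overlapping k-mer across a set of sequences."""
    counts: Counter = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enrichment_ratios(
    positive: List[str],
    negative: List[str],
    k: int = 6,
    pseudocount: float = 1.0,
) -> Dict[str, float]:
    """Ratio of smoothed k-mer frequency in the positive vs. negative set."""
    pos = kmer_counts(positive, k)
    neg = kmer_counts(negative, k)
    kmers = set(pos) | set(neg)
    # Additive smoothing keeps the ratio finite for k-mers absent from one set.
    pos_denom = sum(pos.values()) + pseudocount * len(kmers)
    neg_denom = sum(neg.values()) + pseudocount * len(kmers)
    return {
        kmer: ((pos[kmer] + pseudocount) / pos_denom)
        / ((neg[kmer] + pseudocount) / neg_denom)
        for kmer in kmers
    }


# Hypothetical toy sequences; real input would be exon or intron regions.
positive_set = ["GAAGAAGAAGAA", "TTGAAGAACC"]
negative_set = ["ACGTACGTACGT", "CCGTACGTTT"]

ratios = enrichment_ratios(positive_set, negative_set, k=6)
top_kmer = max(ratios, key=ratios.get)
print(top_kmer, round(ratios[top_kmer], 2))
```

A ratio above 1 marks a hexamer as over-represented in the positive set; a real analysis would add a statistical test and multiple-testing correction before calling a hexamer a putative SRE.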
Project description:Lung-specific genes play critically important roles in lung development, lung physiology, and pathogenesis of lung-associated diseases. We performed a meta-analysis of multiple tissue RNA-seq data to identify lung-specific genes in order to better investigate their lung-specific functions and pathological roles. We identified 83 lung-specific genes consisting of 62 protein-coding genes, five pseudogenes and 16 noncoding RNA genes. About 49.4% of lung-specific genes were implicated in the pathogenesis of lung diseases and 21.7% were involved with lung development. The identification of genes with enriched expression in the lung will facilitate the elucidation of lung-specific functions and their roles in disease pathogenesis.
Project description:Aging is a pleiotropic process affecting many aspects of mammalian physiology. Mammals are composed of distinct cell type identities and tissue environments, but the influence of these cell identities and environments on the trajectory of aging in individual cells remains unclear. Here, we performed single-cell RNA-seq on >50,000 individual cells across three tissues in young and old mice to allow for direct comparison of aging phenotypes across cell types. We found transcriptional features of aging common across many cell types, as well as features of aging unique to each type. Leveraging matrix factorization and optimal transport methods, we found that both cell identities and tissue environments exert influence on the trajectory and magnitude of aging, with cell identity influence predominating. These results suggest that aging manifests with unique directionality and magnitude across the diverse cell identities in mammals.
Project description:Alternative splicing is a vital process for regulating gene expression and promoting proteomic diversity. It plays a key role in genes with tissue-specific expression. This specificity is mainly regulated by splicing factors that bind to specific sequences called splicing regulatory elements (SREs). Here, we report a genome-wide analysis of alternative splicing in multiple tissues, including brain, heart, liver, and muscle. We propose a pipeline to identify differential exons across tissues and hence tissue-specific SREs. In our pipeline, we utilize the DEXSeq package along with our previously reported algorithms. Utilizing the publicly available RNA-Seq data set from the Human BodyMap project, we identified 28,100 differentially used exons across the four tissues. We identified tissue-specific exonic splicing enhancers that overlap with various previously published experimental and computational databases. A complex exonic enhancer regulatory network was revealed, where multiple exonic enhancers were found across multiple tissues while some were found only in specific tissues. Putative combinatorial exonic enhancers and silencers were discovered as well, which may be responsible for exon inclusion or exclusion across tissues. Some of the exonic enhancers were found to co-occur with multiple exonic silencers and vice versa, which demonstrates a complex relationship between tissue-specific exonic enhancers and silencers.
Project description:RNA-sequencing (RNA-seq) is rapidly emerging as the technology of choice for whole-transcriptome studies. However, RNA-seq is not a bias-free technique. It requires large amounts of RNA, and library preparation can introduce multiple artifacts, compounded by problems from later stages in the process. Nevertheless, RNA-seq is increasingly used in multiple studies, including the characterization of tissue-specific transcriptomes from invertebrate models of human disease. The generation of samples in this context is complex, involving the establishment of mutant strains and the delicate, contamination-prone process of dissecting the target tissue. Moreover, in order to achieve the required amount of RNA, multiple samples need to be pooled. Such datasets pose extra challenges due to the large variability that may occur between similar pools, mostly due to the presence of cells from surrounding tissues. Therefore, in addition to standard quality control of RNA-seq data, analytical procedures for control of "biological quality" are critical for successful comparison of gene expression profiles. In this study, the transcriptome of the central nervous system (CNS) of a Drosophila transgenic strain with neuronal-specific RNAi of a ubiquitous gene was profiled using RNA-seq. After observing the existence of an unusual variance in our dataset, we showed that the expression profile of a small panel of marker genes, including GAL4 under control of a tissue-specific driver, can identify libraries with low levels of contamination from neighboring tissues, enabling the selection of a robust dataset for differential expression analysis. We further analyzed the potential of profiling a complex tissue to identify cell-type-specific changes in response to target gene down-regulation. Finally, we showed that trimming the 5' ends of reads decreases nucleotide frequency biases and increases the coverage of protein-coding genes, with a potential positive impact on the incidence of systematic technical errors.
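The 5' trimming step mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how many bases were trimmed or which tool was used, so the record layout, the function name, and the choice of two bases are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# A FASTQ record reduced to (header, sequence, quality string).
FastqRecord = Tuple[str, str, str]


def trim_five_prime(
    records: Iterable[FastqRecord], n_bases: int
) -> Iterator[FastqRecord]:
    """Remove the first n_bases from each read and its quality string."""
    if n_bases < 0:
        raise ValueError("n_bases must be non-negative")
    for header, sequence, quality in records:
        # Sequence and quality are sliced together so they stay the same length.
        yield header, sequence[n_bases:], quality[n_bases:]


# Hypothetical reads with low-quality bases at the 5' end.
reads = [
    ("@read1", "NNACGTACGT", "!!IIIIIIII"),
    ("@read2", "GTTTGACCAA", "##IIIIIIII"),
]

trimmed = list(trim_five_prime(reads, n_bases=2))
for header, sequence, quality in trimmed:
    print(header, sequence, quality)
```

In practice a dedicated read trimmer would be used on the full FASTQ files; the point here is only that bases and quality scores are removed from the same end in step.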
Project description:Background: Quantification of gene expression such as RNA-Seq is a popular approach to study various biological phenomena. Despite the development of RNA-Seq library preparation methods and sequencing platforms in the last decade, RNA extraction remains the most laborious and costly step in RNA-Seq of tissue samples of various organisms. Thus, it is still difficult to examine gene expression in thousands of samples. Results: Here, we developed Direct-RT buffer in which homogenization of tissue samples and direct-lysate reverse transcription can be conducted without RNA purification. The DTT concentration in Direct-RT buffer prevented RNA degradation but not RT in the lysates of several plant tissues, yeast, and zebrafish larvae. Direct reverse transcription on these lysates in Direct-RT buffer produced comparable amounts of cDNA to those synthesized from purified RNA. To maximize the advantage of the Direct-RT buffer, we integrated Direct-RT and targeted RNA-Seq to develop a cost-effective, high-throughput quantification method for the expressions of hundreds of genes: DeLTa-Seq (Direct-Lysate reverse transcription and Targeted RNA-Seq). The DeLTa-Seq method could drastically improve the efficiency and accuracy of gene expression analysis. DeLTa-Seq analysis of 1056 samples revealed the temperature-dependent effects of jasmonic acid and salicylic acid in Arabidopsis thaliana. Conclusions: The DeLTa-Seq method can realize large-scale studies using thousands of animal, plant, and microorganism samples, such as chemical screening, field experiments, and studies focusing on individual variability. In addition, Direct-RT is also beneficial for gene expression analysis in small tissues from which it is difficult to purify enough RNA for the experiments.
Project description:Background: Tissue-specific RNA plasticity broadly impacts the development, tissue identity and adaptability of all organisms, but changes in composition, expression levels and its impact on gene regulation in different somatic tissues are largely unknown. Here we developed a new method, polyA-tagging and sequencing (PAT-Seq) to isolate high-quality tissue-specific mRNA from Caenorhabditis elegans intestine, pharynx and body muscle tissues and study changes in their tissue-specific transcriptomes and 3'UTRomes. Results: We have identified thousands of novel genes and isoforms differentially expressed between these three tissues. The intestine transcriptome is expansive, expressing over 30% of C. elegans mRNAs, while muscle transcriptomes are smaller but contain characteristic unique gene signatures. Active promoter regions in all three tissues reveal both known and novel enriched tissue-specific elements, along with putative transcription factors, suggesting novel tissue-specific modes of transcription initiation. We have precisely mapped approximately 20,000 tissue-specific polyadenylation sites and discovered that about 30% of transcripts in somatic cells use alternative polyadenylation in a tissue-specific manner, with their 3'UTR isoforms significantly enriched with microRNA targets. Conclusions: For the first time, PAT-Seq allowed us to directly study tissue specific gene expression changes in an in vivo setting and compare these changes between three somatic tissues from the same organism at single-base resolution within the same experiment. We pinpoint precise tissue-specific transcriptome rearrangements and for the first time link tissue-specific alternative polyadenylation to miRNA regulation, suggesting novel and unexplored tissue-specific post-transcriptional regulatory networks in somatic cells.