Project description:Transcriptome analysis is an important approach for associating genotype with phenotype. The content and dynamics of the eukaryotic transcriptome are far more complex than previously anticipated. Here we integrated high-throughput RNA-seq with a paired-end method to conduct an unprecedentedly deep survey of the transcription profile of cultivated rice, one of the oldest domesticated crop species, which has since spread worldwide to become one of the major staple foods. Analysis of read mapping revealed 4,244 previously uncharacterized transcripts, including many protein-coding genes and putative functional non-coding RNA genes. Alignment of junction reads indicated that over 42% of rice multiple-exon genes produce two or more distinct splicing isoforms. Intriguingly, we identified 1,356 putative gene fusion events, indicating that the 234 fusion genes produced by trans-splicing, together with the pervasive alternative splicing events, vastly increase the complexity of the rice transcriptome. Digital gene expression profiling revealed that most rice duplicate genes were maintained by selective constraint on gene dosage, which would increase the genetic robustness of rice to counteract deleterious mutations. Keywords: Expression profiling by high throughput sequencing. mRNA expression in 8 independent rice tissues was determined by RNA-Seq using short reads from high-throughput sequencing technology. Small RNA populations from a mixture pooled from the total RNA of each of the 8 tissues were also sequenced.
Project description:Accurate profiling of minute quantities of RNA in a global manner can enable key advances in many scientific and clinical disciplines. Here, we present low-quantity RNA sequencing (LQ-RNAseq), a high-throughput sequencing-based technique allowing whole transcriptome surveys from subnanogram RNA quantities in an amplification/ligation-free manner. LQ-RNAseq involves first-strand cDNA synthesis from RNA templates, followed by 3' polyA tailing of the single-stranded cDNA products and direct single molecule sequencing. We applied LQ-RNAseq to profile S. cerevisiae polyA+ transcripts, demonstrated the reproducibility of the approach across different sample preparations and independent instrument runs, and established the absolute quantitative power of this method through comparisons with other reported transcript profiling techniques and through RNA spike-in experiments. We demonstrate the practical application of this approach by defining the transcriptional landscape of mouse embryonic and induced pluripotent stem cells, observing transcriptional differences that include over 100 genes differentially expressed between these otherwise very similar stem cell populations. This amplification-independent technology, which uses small quantities of nucleic acid and provides quantitative measurements of cellular transcripts, enables global gene expression measurements from minute amounts of material and offers broad utility in both basic research and translational biology for the characterization of rare cells.
Project description:Defining molecular features that can predict the recurrence of colorectal cancer (CRC) in stage II-III patients remains challenging in cancer research. Most available clinical samples are Formalin-Fixed, Paraffin-Embedded (FFPE). NanoString nCounter® and Affymetrix GeneChip® Human Transcriptome Array 2.0 (HTA) are the two platforms marketed for high-throughput gene expression profiling of FFPE samples. In this study, we evaluated the expression of 516 genes from published frozen tissue-derived prognostic signatures in 42 FFPE CRC samples measured by both platforms. Based on HTA platform-derived data, we identified both gene (99 individual genes, FDR < 0.05) and gene set (four of the six reported multi-gene signatures with sufficient information for evaluation, P < 0.05) expression differences associated with survival outcomes. Using nCounter platform-derived data, one of the six multi-gene signatures (P < 0.05) but no individual gene was associated with survival outcomes. Our study indicated that sufficiently high-quality RNA could be obtained from FFPE tumor tissues to detect frozen tissue-derived prognostic gene expression signatures for CRC patients.
Project description:As part of the RATHER (RAtional THERapy for breast cancer: individualized treatment for difficult-to-treat breast cancer subtypes) consortium, expression profiling of 144 untreated primary invasive lobular carcinoma (ILC) breast cancer tissues and 15 ILC cell lines was performed using microarrays. Gene expression profiling of 144 ILC breast cancers and 15 ILC cell lines.
Project description:High-throughput RNA-sequencing has become the gold standard method for whole-transcriptome gene expression analysis, and is widely used in numerous applications to study cell and tissue transcriptomes. It is also being increasingly used in a number of clinical applications, including expression profiling for diagnostics and alternative transcript detection. However, despite its many advantages, RNA sequencing can be challenging in some situations, for instance in cases of low input amounts or degraded RNA samples. Several protocols have been proposed to overcome these challenges, and many are available as commercial kits. In this study, we systematically test three recent commercial technologies for RNA-seq library preparation (TruSeq, SMARTer and SMARTer Ultra-Low) on human biological reference materials, using standard (1 mg), low (100 ng and 10 ng) and ultra-low (<1 ng) input amounts, and for mRNA and total RNA, stranded and unstranded. The results are analyzed using read quality and alignment metrics, gene detection and differential gene expression metrics. Overall, we show that the TruSeq kit performs well with an input amount of 100 ng, while the SMARTer kit shows decreased performance for inputs of 100 and 10 ng, and the SMARTer Ultra-Low kit performs relatively well for input amounts <1 ng. All the results are discussed in detail, and we provide guidelines for biologists for the selection of an RNA-seq library preparation kit.
Project description:Background: Advances in sequencing technology have opened new opportunities to use transcriptomics for studying malaria pathology and epidemiology. Although the study of the whole parasite transcriptome has in recent years proved essential to understanding parasite biology, there is no compiled, up-to-date reference protocol for the efficient generation of transcriptome data from a growing number of samples. Here, a comprehensive methodology for preserving, extracting, amplifying, and sequencing full-length mRNA transcripts from Plasmodium-infected blood samples is presented that can be fully streamlined for high-throughput studies. Results: The utility of various commercially available RNA-preserving reagents was evaluated across a range of storage conditions. Similarly, several RNA extraction protocols were compared, and the most suitable method for the extraction of high-quality total RNA from low-parasitaemia and low-volume blood samples was established. Furthermore, the criteria needed to evaluate the quality and integrity of Plasmodium RNA in the presence of human RNA were updated. Optimization of the SMART-seq2 amplification method to better suit AT-rich Plasmodium falciparum RNA samples allowed us to generate high-quality transcriptomes from as little as 10 ng of total RNA and a lower parasitaemia limit of 0.05%. Finally, a modified method for depletion of unwanted human haemoglobin transcripts using in vitro CRISPR-Cas9 treatment was designed, thus improving parasite transcriptome coverage in low-parasitaemia samples. To prove the functionality of the pipeline for both laboratory and field strains, the highest-resolution (2-hour) RNA-seq transcriptome for the P. falciparum 3D7 intraerythrocytic life cycle available to date was generated, and the entire protocol was applied to create the largest transcriptome dataset from Southeast Asian field isolates. Conclusions: Overall, the presented methodology is an inclusive pipeline for the generation of good-quality transcriptomic data from a diverse range of Plasmodium-infected blood samples with varying parasitaemia and RNA inputs. The flexibility of this pipeline to be adapted to robotic handling will facilitate both small- and large-scale future transcriptomic studies in the field of malaria.
Project description:Massively parallel RNA sequencing (RNA-seq) has yielded a wealth of new insights into transcriptional regulation. A first step in the analysis of RNA-seq data is the alignment of short sequence reads to a common reference genome or transcriptome. Genetic variants that distinguish individual genomes from the reference sequence can cause reads to be misaligned, resulting in biased estimates of transcript abundance. Fine-tuning of read alignment algorithms does not correct this problem. We have developed Seqnature software to construct individualized diploid genomes and transcriptomes for multiparent populations and have implemented a complete analysis pipeline that incorporates other existing software tools. We demonstrate in simulated and real data sets that alignment to individualized transcriptomes increases read mapping accuracy, improves estimation of transcript abundance, and enables the direct estimation of allele-specific expression. Moreover, when applied to expression QTL mapping we find that our individualized alignment strategy corrects false-positive linkage signals and unmasks hidden associations. We recommend the use of individualized diploid genomes over reference sequence alignment for all applications of high-throughput sequencing technology in genetically diverse populations.
Project description:Pelvic inflammatory disease (PID) is a female upper genital tract inflammatory disorder that arises after sexually transmitted bacterial infections (STI). Factors modulating risk for reproductive sequelae include co-infection, microbiota, host genetics and physiology. In a pilot study of cervical samples obtained from women at high risk for STIs, we examined the potential for unbiased characterization of host, pathogen and microbiome interactions using whole transcriptome sequencing analysis of ribosomal RNA-depleted total RNAs (Total RNA-Seq). Only samples from women with STI infection contained pathogen-specific sequences (3 to 38% transcriptome coverage). Simultaneously, we identified and quantified their active microbial communities. After integration with host-derived reads from the same data, we detected clustering of host transcriptional profiles that reflected microbiome differences and STI infection. Together, our study suggests that total RNA profiling will advance understanding of the interplay of pathogen, host and microbiota during natural infection and may reveal novel, outcome-relevant biomarkers.
Project description:Zebrafish have the ability to regenerate many organs and tissues, including the melanocytes, which are elements of the skin. RNA sequencing (RNA-Seq) is a high-throughput sequencing method that facilitates transcript identification and precise quantification of gene expression. The development of RNA-Seq technologies and their extensive data-analysis methods makes it possible to investigate regulatory genes and functional gene annotations under specific conditions. Here, we aim to perform large-scale comparative deep transcriptome profiling of regeneration versus cancer in the following two cellular contexts: the skin melanocytes, which can substantially regenerate in mammals, and melanoma, which is the type of cancer that begins in the melanocytes.