Mining the Archives: A Cross-Platform Analysis of Gene Expression Profiles in Archival Formalin-Fixed Paraffin-Embedded Tissues.
ABSTRACT: Formalin-fixed paraffin-embedded (FFPE) tissue samples represent a potentially invaluable resource for transcriptomic research. However, use of FFPE samples in genomic studies has been limited by technical challenges resulting from nucleic acid degradation. Here we evaluated gene expression profiles derived from fresh-frozen (FRO) and FFPE mouse liver tissues preserved in formalin for different amounts of time using 2 DNA microarray protocols and 2 whole-transcriptome sequencing (RNA-seq) library preparation methodologies. The ribo-depletion protocol outperformed the other methods by having the highest correlations of differentially expressed genes (DEGs), and best overlap of pathways, between FRO and FFPE groups. The effect of sample time in formalin (18 h or 3 weeks) on gene expression profiles indicated that test article treatment, not preservation method, was the main driver of gene expression profiles. Meta- and pathway analyses indicated that biological responses were generally consistent for 18 h and 3 week FFPE samples compared with FRO samples. However, clear erosion of signal intensity with time in formalin was evident, and DEG numbers differed by platform and preservation method. Lastly, we investigated the effect of time in paraffin on genomic profiles. Ribo-depletion RNA-seq analysis of 8-, 19-, and 26-year-old control blocks resulted in comparable quality metrics, including expected distributions of mapped reads to exonic, untranslated region, intronic, and ribosomal fractions of the transcriptome. Overall, our results indicate that FFPE samples are appropriate for use in genomic studies in which frozen samples are not available, and that ribo-depletion RNA-seq is the preferred method for this type of analysis in archival and long-aged FFPE samples.
Project description:Transcriptome profiling can provide information of great value in clinical decision-making, yet RNA from readily available formalin-fixed paraffin-embedded (FFPE) tissue is often too degraded for quality sequencing. To assess the clinical utility of FFPE-derived RNA, we performed ribo-deplete RNA extractions on > 3200 FFPE slide samples; 25 of these had direct FFPE vs. fresh frozen (FF) replicates, 57 were sequenced in 2 different labs, 87 underwent multiple library analyses, and 16 had direct microdissected vs. macrodissected replicates. Poly-A versus ribo-depletion RNA extraction methods were compared using transcriptomes of TCGA cohort and 3116 FFPE samples. Compared to FF, FFPE transcripts coding for nuclear/cytoplasmic proteins involved in DNA packaging, replication, and protein synthesis were detected at lower rates and zinc finger family transcripts were of poorer quality. The greatest difference in extraction methods was in histone transcripts which typically lack poly-A tails. Encouragingly, the overall sequencing success rate was 81%. Exome coverage was highly concordant in direct FFPE and FF replicates, with 98% agreement in coding exon coverage and a median correlation of whole transcriptome profiles of 0.95. We provide strong rationale for clinical use of FFPE-derived RNA based on the robustness, reproducibility, and consistency of whole transcriptome profiling.
Project description:RNA sequencing (RNA-Seq) is often used for transcriptome profiling as well as the identification of novel transcripts and alternative splicing events. Typically, RNA-Seq libraries are prepared from total RNA using poly(A) enrichment of the mRNA (mRNA-Seq) to remove ribosomal RNA (rRNA); however, this method fails to capture non-poly(A) transcripts or partially degraded mRNAs. Hence, an mRNA-Seq protocol will not be compatible for use with RNAs coming from Formalin-Fixed and Paraffin-Embedded (FFPE) samples. To address the desire to perform RNA-Seq on FFPE materials, we evaluated two different library preparation protocols that could be compatible for use with small RNA fragments. We obtained paired Fresh Frozen (FF) and FFPE RNAs from multiple tumors and subjected these to different gene expression profiling methods. We tested 11 human breast tumor samples using: (a) FF RNAs by microarray, mRNA-Seq, Ribo-Zero-Seq and DSN-Seq (Duplex-Specific Nuclease) and (b) FFPE RNAs by Ribo-Zero-Seq and DSN-Seq. We also performed these different RNA-Seq protocols using 10 TCGA tumors as a validation set. The data from paired RNA samples showed high concordance in transcript quantification across all protocols and between FF and FFPE RNAs. In both FF and FFPE, Ribo-Zero-Seq removed rRNA with efficiency comparable to mRNA-Seq, and it provided an equivalent or less biased coverage on gene 3' ends. Compared to mRNA-Seq where 69% of bases were mapped to the transcriptome, DSN-Seq and Ribo-Zero-Seq contained significantly fewer reads mapping to the transcriptome (20-30%); in these RNA-Seq protocols, many if not most reads mapped to intronic regions.
Approximately 14 million reads in mRNA-Seq and 45-65 million reads in Ribo-Zero-Seq or DSN-Seq were required to achieve the same gene detection levels as a standard Agilent DNA microarray. Our results demonstrate that compared to mRNA-Seq and microarrays, Ribo-Zero-Seq provides equivalent rRNA removal efficiency, coverage uniformity, genome-based mapped reads, and consistently high quality quantification of transcripts. Moreover, Ribo-Zero-Seq and DSN-Seq have consistent transcript quantification using FFPE RNAs, suggesting that RNA-Seq can be used with FFPE-derived RNAs for gene expression profiling.
Project description:Sequencing technologies now provide unprecedented access to genomic information in archival formalin-fixed paraffin-embedded (FFPE) tissue samples. However, little is known about artifacts induced during formalin fixation, which could bias results. Here we evaluated global changes in RNA-sequencing profiles between matched frozen and FFPE samples. RNA-sequencing was performed on liver samples collected from mice treated with a reference chemical (phenobarbital) or vehicle control for 7 days. Each sample was divided into four parts: (1) fresh-frozen, (2) direct-fixed in formalin for 18 h, (3) frozen then formalin-fixed, and (4) frozen then ethanol-fixed and paraffin-embedded (n = 6/group/condition). Direct fixation resulted in 2,946 differentially expressed genes (DEGs) vs. fresh-frozen, 98% of which were down-regulated. Freezing prior to formalin fixation had ≥ 95% fewer DEGs vs. direct fixation, indicating that most formalin-derived transcriptional effects in the liver occurred during fixation. This finding was supported by retrospective studies of paired frozen and FFPE samples, which identified consistent enrichment in oxidative stress, mitochondrial dysfunction, and transcription initiation pathways with direct fixation. Notably, direct formalin fixation in the parent study did not significantly impact response profiles resulting from chemical exposure. These results advance our understanding of FFPE samples as a resource for genomic research.
Project description:Furan is a mouse and rat hepatocarcinogen. We sought to determine if furan-induced gene expression changes could be detected in paired fresh-frozen and formalin-fixed paraffin embedded (FFPE) samples using one-colour microarrays. All samples in this study (fresh-frozen, 18 hours in formalin, 3 weeks in formalin) were also examined using two-colour microarrays and RNA-seq (ribo-depletion and polyA-enrichment protocols) in order to determine the effect of the technology on gene expression profiles.
Project description:Furan is a mouse and rat hepatocarcinogen. We sought to determine if furan-induced gene expression changes could be detected in paired fresh-frozen and formalin-fixed paraffin embedded (FFPE) samples using RNA-seq (polyA-enrichment protocol). All samples in this study (fresh-frozen, 18 hours in formalin, 3 weeks in formalin) were also examined using one- and two-colour microarrays and RNA-seq (ribo-depletion protocol) in order to determine the effect of the technology on gene expression profiles.
Project description:Furan is a mouse and rat hepatocarcinogen. We sought to determine if furan-induced gene expression changes could be detected in paired fresh-frozen and formalin-fixed paraffin embedded (FFPE) samples using RNA-seq (ribo-depletion protocol). All samples in this study (fresh-frozen, 18 hours in formalin, 3 weeks in formalin) were also examined using one- and two-colour microarrays and RNA-seq (polyA-enrichment protocol) in order to determine the effect of the technology on gene expression profiles.
Project description:RNA-seq by poly(A) selection is currently the most common protocol for whole transcriptome sequencing as it provides a broad, detailed, and accurate view of the RNA landscape. Unfortunately, the utility of poly(A) libraries is greatly limited when the input RNA is degraded, which is the norm for research tissues and clinical samples, especially when specimens are formalin-fixed. To facilitate the use of RNA sequencing beyond cell lines and in the clinical setting, we developed an exome-capture transcriptome protocol with greatly improved performance on degraded RNA. Capture transcriptome libraries enable measuring absolute and differential gene expression, calling genetic variants, and detecting gene fusions. Through validation against gold-standard poly(A) and Ribo-Zero libraries from intact RNA, we show that capture RNA-seq provides accurate and unbiased estimates of RNA abundance, uniform transcript coverage, and broad dynamic range. Unlike poly(A) selection and Ribo-Zero depletion, capture libraries retain these qualities regardless of RNA quality and provide excellent data from clinical specimens including formalin-fixed paraffin-embedded (FFPE) blocks. Systematic improvements across key applications of RNA-seq are shown on a cohort of prostate cancer patients and a set of clinical FFPE samples. Further, we demonstrate the utility of capture RNA-seq libraries in a patient with a highly malignant solitary fibrous tumor (SFT) enrolled in our clinical sequencing program called MI-ONCOSEQ. Capture transcriptome profiling from FFPE revealed two oncogenic fusions: the pathognomonic NAB2-STAT6 inversion and a therapeutically actionable BRAF fusion, which may drive this specific cancer's aggressive phenotype.
Project description:BACKGROUND: The main bottleneck for genomic studies of tumors is the limited availability of fresh frozen (FF) samples collected from patients, coupled with comprehensive long-term clinical follow-up. This shortage could be alleviated by using existing large archives of routinely obtained and stored Formalin-Fixed Paraffin-Embedded (FFPE) tissues. However, since these samples are partially degraded, their RNA sequencing is technically challenging. RESULTS: In an effort to establish a reliable and practical procedure, we compared three protocols for RNA sequencing using pairs of FF and FFPE samples, both taken from the same breast tumor. In contrast to previous studies, we compared the expression profiles obtained from the two matched sample types, using the same protocol for both. Three protocols were tested on low initial amounts of RNA, as little as 100 ng, to represent the possibly limited availability of clinical samples. For two of the three protocols tested, poly(A) selection (mRNA-seq) and ribosomal-depletion, the total gene expression profiles of matched FF and FFPE pairs were highly correlated. For both protocols, differential gene expression between two FFPE samples was in agreement with their matched FF samples. Notably, although expression levels of FFPE samples by mRNA-seq were mainly represented by the 3'-end of the transcript, they yielded very similar results to those obtained by ribosomal-depletion protocol, which produces uniform coverage across the transcript. Further, focusing on clinically relevant genes, we showed that the high correlation between expression levels persists at higher resolutions. CONCLUSIONS: Using the poly(A) protocol for FFPE exhibited, unexpectedly, similar efficiency to the ribosomal-depletion protocol, with the latter requiring much higher (2-3 fold) sequencing depth to compensate for the relatively low fraction of reads mapped to the transcriptome.
The results indicate that standard poly(A)-based RNA sequencing of archived FFPE samples is a reliable and cost-effective alternative to mRNA-seq of FF samples. Expression profiling of FFPE samples by mRNA-seq can facilitate much needed extensive retrospective clinical genomic studies.
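The sequencing-depth compensation mentioned in the conclusions above can be illustrated with a first-order calculation. The sketch below assumes that only transcriptome-mapped reads contribute to expression estimates and that their number scales linearly with total depth. The function name and the example fractions are hypothetical and are not taken from the study.

```python
def depth_fold_increase(reference_fraction, protocol_fraction):
    """Estimate how many times deeper a protocol must be sequenced to yield
    the same number of transcriptome-mapped reads as a reference protocol.

    Both arguments are fractions (0 < f <= 1) of reads mapping to the
    transcriptome. This is a simplifying linear model, not a study result.
    """
    for f in (reference_fraction, protocol_fraction):
        if not 0 < f <= 1:
            raise ValueError("fractions must be in the interval (0, 1]")
    return reference_fraction / protocol_fraction


# Hypothetical fractions chosen only for illustration:
# 60% of poly(A) reads and 25% of ribo-depletion reads map to the transcriptome.
fold = depth_fold_increase(0.60, 0.25)
print(f"Estimated depth increase: {fold:.1f}-fold")
```

Under these assumed fractions the estimate falls within the 2-3 fold range reported above; real requirements also depend on library complexity and the detection threshold used.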
Project description:Accurate transcriptional sequencing (RNA-seq) from formalin-fixed paraffin-embedded (FFPE) tumor samples presents an important challenge for translational research and diagnostic development. In addition, there are now several different protocols to prepare a sequencing library from total RNA. We evaluated the accuracy of RNA-seq data generated from FFPE samples in terms of expression profiling. We designed a biospecimen study to directly compare gene expression results from different protocols to prepare libraries for RNA-seq from human breast cancer tissues, with randomization to fresh-frozen (FF) or FFPE conditions. The protocols were compared using multiple computational methods to assess alignment of reads to reference genome, and the uniformity and continuity of coverage; as well as the variance and correlation of overall gene expression and patterns of measuring coding sequence, phenotypic patterns of gene expression, and measurements from representative multigene signatures. The principal determinant of variance in gene expression was use of exon capture probes, followed by the conditions of preservation (FF versus FFPE), and phenotypic differences between breast cancers. One protocol, with RNase H-based rRNA depletion, exhibited least variability of gene expression measurements, strongest correlation between FF and FFPE samples, and was generally representative of the transcriptome from standard FF RNA-seq protocols. Method of RNA-seq library preparation from FFPE samples had marked effect on the accuracy of gene expression measurement compared to matched FF samples. Nevertheless, some protocols produced highly concordant expression data from FFPE RNA-seq data, compared to RNA-seq results from matched frozen samples.
Project description:Formalin-fixed paraffin-embedded (FFPE) tissue samples are routinely archived in the course of patient care and can be linked to clinical outcomes with long-term follow-up. However, FFPE tissues have degraded RNA which poses challenges for analyzing gene expression. Next-generation sequencing (NGS) is rapidly becoming accepted as an effective tool for measuring gene expressions for research and clinical use. However, the feasibility of NGS has not been firmly established when using FFPE tissue. We optimized strategies for whole transcriptome sequencing (RNA-seq) using FFPE tissue. Ribosomal RNA (rRNA) was successfully depleted by competitive hybridization using the Ribo-zero™ Kit (Epicentre Biotechnologies), and rRNA sequence content was less than one percent for each library. Gene expression measured by FFPE RNA-seq was compared to two different standards: RNA-seq from fresh frozen (FF) tissue and quantitative PCR (qPCR). Both FF and FFPE tumors were sequenced on an Illumina Genome Analyzer IIX with an average of 10 million reads. The distribution of FPKMs (fragments per kilobase of exon per million fragments mapped) and number of detected genes were similar between FFPE and FF. RNA-seq expressions from FF and FFPE samples from the same renal cell carcinoma (RCC) correlated highly (r = 0.919 for tumor 1 and r = 0.954 for tumor 2). On hierarchical cluster analysis, samples clustered by patient identity rather than method of preservation. TaqMan qPCR of 424 RCC-related genes correlated highly with FFPE RNA-seq expressions (r = 0.775 for FFPE tumor 1, r = 0.803 for FFPE tumor 2). Expression fold changes were considered, to assess biologic relevance of gene expressions. Expression fold changes between FFPE tumors (tumor 1/tumor 2) correlated well when comparing qPCR and RNA-seq (r = 0.890).
Expression fold changes between tumors from different risk groups (our high risk RCC/The Cancer Genome Atlas, TCGA, low risk RCC) also correlated well when comparing RNA-seq from FF and FFPE tumors (r = 0.887). FFPE RNA-seq provides reliable gene expression data, comparable to that obtained from fresh frozen tissue. It represents a useful tool for discovery and validation of biomarkers.
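As an illustration of the two quantities used above, the following minimal sketch computes FPKM from its definition (fragments per kilobase of exon per million fragments mapped) and a Pearson correlation coefficient r between two expression vectors. The function names and all input values are hypothetical; the study's own pipeline and any transformation applied before correlation are not described here and are not reproduced.

```python
import math


def fpkm(fragment_count, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragments must be positive")
    kilobases = exon_length_bp / 1_000
    millions_mapped = total_mapped_fragments / 1_000_000
    return fragment_count / (kilobases * millions_mapped)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical gene: 500 fragments, 2,000 bp of exon, 10 million mapped fragments.
print(fpkm(500, 2_000, 10_000_000))

# Hypothetical matched FF and FFPE expression values for four genes.
ff_values = [1.0, 5.0, 20.0, 80.0]
ffpe_values = [1.5, 4.0, 18.0, 70.0]
print(round(pearson_r(ff_values, ffpe_values), 3))
```

In practice expression values are often log-transformed before correlation so that a few highly expressed genes do not dominate r; whether that was done in the study is not stated above.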