The "grep" command but not FusionMap, FusionFinder or ChimeraScan captures the CIC-DUX4 fusion gene from whole transcriptome sequencing data on a small round cell tumor with t(4;19)(q35;q13).
ABSTRACT: Whole transcriptome sequencing was used to study a small round cell tumor in which a t(4;19)(q35;q13) was part of the complex karyotype but where the initial reverse transcriptase PCR (RT-PCR) examination did not detect a CIC-DUX4 fusion transcript, previously described as the crucial gene-level outcome of this specific translocation. The RNA sequencing data were analysed using the FusionMap, FusionFinder, and ChimeraScan programs, which are specifically designed to identify fusion genes. FusionMap, FusionFinder, and ChimeraScan identified 1017, 102, and 101 fusion transcripts, respectively, but CIC-DUX4 was not among them. Since the RNA sequencing data are in the fastq text-based format, we searched the files using the "grep" command-line utility. The "grep" command searches the text for specific expressions and displays, by default, the lines where matches occur. The "specific expression" was a sequence of 20 nucleotides from the coding part of the last exon (exon 20) of CIC (Reference Sequence: NM_015125.3), chosen because all CIC breakpoints reported so far have occurred there. Fifteen chimeric CIC-DUX4 cDNA sequences were captured and the fusion between the CIC and DUX4 genes was mapped precisely. New primer combinations were constructed based on these findings and were used together with a polymerase suitable for amplification of GC-rich DNA templates to amplify CIC-DUX4 cDNA fragments which had the same fusion point found with "grep". In conclusion, FusionMap, FusionFinder, and ChimeraScan generated a plethora of fusion transcripts but did not detect the biologically important CIC-DUX4 chimeric transcript; they are generally useful but evidently suffer from imperfect sensitivity and specificity. The "grep" command is an excellent tool to capture chimeric transcripts from RNA sequencing data when the pathological and/or cytogenetic information strongly indicates the presence of a specific fusion gene.
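The search described in the abstract above can be approximated in a few lines of Python. This is a minimal sketch, not the authors' procedure: the function names are hypothetical, the bait sequence in the usage note is a placeholder rather than the real 20-nucleotide CIC sequence, and the input is assumed to be an uncompressed FASTQ file with exactly four lines per read. Unlike a plain "grep" for one string, the sketch also checks the reverse complement of the bait, which is an addition made here.

```python
from typing import Iterable, Iterator, Tuple

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_reads(fastq_lines: Iterable[str], bait: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) for every read containing the bait.

    Assumes four-line FASTQ records: header, sequence, '+', qualities.
    A read matches if it contains the bait or its reverse complement.
    """
    bait = bait.upper()
    bait_rc = reverse_complement(bait)
    header = ""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:
            # First line of a record: the read identifier.
            header = line.rstrip("\n")
        elif i % 4 == 1:
            # Second line of a record: the nucleotide sequence.
            seq = line.strip().upper()
            if bait in seq or bait_rc in seq:
                yield header, seq
```

With a file handle, usage would look like `for header, seq in find_reads(open("reads.fastq"), bait): print(header, seq)`, where `reads.fastq` and `bait` are supplied by the user. The captured reads can then be inspected by eye or aligned to locate the fusion point.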
Project description:SUMMARY: Chimera is a Bioconductor package that organizes, annotates, analyses and validates fusions reported by different fusion detection tools; current implementation can deal with output from bellerophontes, chimeraScan, deFuse, fusionCatcher, FusionFinder, FusionHunter, FusionMap, mapSplice, Rsubread, tophat-fusion and STAR. The core of Chimera is a fusion data structure that can store fusion events detected with any of the aforementioned tools. Fusions are then easily manipulated with standard R functions or through the set of functionalities specifically developed in Chimera with the aim of supporting the user in managing fusions and discriminating false-positive results.
Project description:An acute myeloid leukemia was suspected of having a t(8;16)(p11;p13) resulting in a KAT6A-CREBBP fusion because the bone marrow was packed with monoblasts showing marked erythrophagocytosis. The diagnostic karyotype was 46,XY,add(1)(p13),t(8;21)(p11;q22),der(16)t(1;16)(p13;p13)/46,XY; thus, no direct confirmation of the suspicion could be given although both 8p11 and 16p13 seemed to be rearranged. The leukemic cells were examined in two ways to find out whether a cryptic KAT6A-CREBBP was present. The first was the "conventional" approach: G-banding was followed by fluorescence in situ hybridization (FISH) and reverse transcription PCR (RT-PCR). The second was RNA-Seq followed by data analysis using the FusionMap and FusionFinder programs, with special emphasis on candidates located in the 1p13, 8p11, 16p13, and 21q22 breakpoints. FISH analysis indicated the presence of a KAT6A/CREBBP chimera. RT-PCR followed by Sanger sequencing of the amplified product showed that a chimeric KAT6A-CREBBP transcript was present in the patient's bone marrow. Surprisingly, however, KAT6A-CREBBP was not among the 874 and 35 fusion transcripts identified by the FusionMap and FusionFinder programs, respectively, although 11 sequences of the raw RNA-sequencing data were KAT6A-CREBBP fragments. This illustrates that although many fusion transcripts can be found by RNA-Seq combined with FusionMap and FusionFinder, the pathogenetically essential fusion is not always picked up by the bioinformatic algorithms behind these programs. The present study not only illustrates potential pitfalls of current data analysis programs of whole transcriptome sequences which make them less useful as stand-alone techniques, but also that leukemia diagnosis still relies on integration of clinical, hematologic, and genetic disease features of which the former two by no means have become superfluous.
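The step of giving "special emphasis" to candidates at the karyotypic breakpoints can be pictured as a simple filter on cytogenetic bands. The sketch below is an illustration under assumptions: the `FusionCandidate` record and its field names are hypothetical and do not reflect the actual output format of FusionMap or FusionFinder, and the band annotations would have to come from a separate gene annotation source.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FusionCandidate:
    """Hypothetical record for one reported fusion transcript."""
    gene5: str   # 5' partner gene
    band5: str   # cytogenetic band of the 5' partner, e.g. "8p11.21"
    gene3: str   # 3' partner gene
    band3: str   # cytogenetic band of the 3' partner


def in_region(band: str, region: str) -> bool:
    """True if band equals the region or is a sub-band of it.

    "8p11.21" is inside "8p11", but "11p13" is not inside "1p13".
    """
    return band == region or band.startswith(region + ".")


def prioritize(candidates: Iterable[FusionCandidate],
               regions: Iterable[str]) -> List[FusionCandidate]:
    """Keep candidates with at least one partner in a region of interest."""
    regions = list(regions)
    return [
        c for c in candidates
        if any(in_region(c.band5, r) or in_region(c.band3, r) for r in regions)
    ]
```

For the case described above, the regions of interest would be `["1p13", "8p11", "16p13", "21q22"]`. Such a filter only narrows down what the programs already reported; as the abstract notes, it cannot recover a fusion that was never called.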
Project description:BACKGROUND: Gene fusions arising from chromosomal translocations have been implicated in cancer. RNA-seq has the potential to discover such rearrangements generating functional proteins (chimera/fusion). Recently, many methods for chimera detection have been published. However, the specificity and sensitivity of those tools have not been extensively investigated in a comparative way. RESULTS: We tested eight fusion-detection tools (FusionHunter, FusionMap, FusionFinder, MapSplice, deFuse, Bellerophontes, ChimeraScan, and TopHat-fusion) to detect fusion events using synthetic and real datasets encompassing chimeras. A comparison run only on synthetic data could generate misleading results, since we found no counterpart in the real datasets. Furthermore, most tools report a very high number of false positive chimeras. In particular, the most sensitive tool, ChimeraScan, reports a large number of false positives that we were able to significantly reduce by devising and applying two filters to remove fusions not supported by fusion junction-spanning reads or encompassing large intronic regions. CONCLUSIONS: The discordant results obtained using synthetic and real datasets suggest that synthetic datasets encompassing fusion events may not fully capture the complexity of an RNA-seq experiment. Moreover, fusion detection tools are still limited in sensitivity or specificity; thus, there is room for further improvement in the fusion-finder algorithms.
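The two filters mentioned in the abstract above can be sketched as follows. This is an illustration only: the `FusionCall` record, its field names, and both default thresholds are assumptions made here, since the abstract does not state the cut-offs or the data layout the authors used.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FusionCall:
    """Hypothetical summary of one fusion reported by a detection tool."""
    name: str
    junction_spanning_reads: int  # reads crossing the fusion junction
    intronic_bases: int           # intronic sequence covered by the call


def filter_calls(calls: Iterable[FusionCall],
                 min_spanning_reads: int = 1,
                 max_intronic_bases: int = 1000) -> List[FusionCall]:
    """Drop calls lacking junction support or spanning large introns.

    Both thresholds are arbitrary placeholders, not published values.
    """
    kept = []
    for call in calls:
        if call.junction_spanning_reads < min_spanning_reads:
            continue  # filter 1: no junction-spanning read support
        if call.intronic_bases > max_intronic_bases:
            continue  # filter 2: encompasses a large intronic region
        kept.append(call)
    return kept
```

Keeping the thresholds as parameters makes it easy to see how many calls each filter removes on a given dataset before settling on values.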
Project description:BACKGROUND: RNA-seq has the potential to discover genes created by chromosomal rearrangements. Fusion genes, also known as "chimeras", are formed by the breakage and re-joining of two different chromosomes. It is known that chimeras have been implicated in the development of cancer. Few publications in the past showed the presence of fusion events also in normal tissue, but with very limited overlaps between their results. More recently, two fusion genes in normal tissues were detected using both RNA-seq and protein data. Due to heterogeneous results in identifying chimeras in normal tissue, we decided to evaluate the efficacy of state of the art fusion finders in detecting chimeras in RNA-seq data from normal tissues. RESULTS: We compared the performance of six fusion-finder tools: FusionHunter, FusionMap, FusionFinder, MapSplice, deFuse and TopHat-fusion. To evaluate the sensitivity we used a synthetic dataset of fusion-products, called positive dataset; in these experiments FusionMap, FusionFinder, MapSplice, and TopHat-fusion are able to detect more than 78% of fusion genes. All tools were error prone with high variability among the tools, identifying some fusion genes not present in the synthetic dataset. To better investigate the false discovery chimera detection rate, synthetic datasets free of fusion-products, called negative datasets, were used. The negative datasets have different read lengths and quality scores, which allow detecting dependency of the tools on both these features. FusionMap, FusionFinder, MapSplice, deFuse and TopHat-fusion were error-prone. Only FusionHunter results were free of false positives. FusionMap gave the best compromise in terms of specificity in the negative dataset and of sensitivity in the positive dataset. CONCLUSIONS: We have observed a dependency of the tools on read length, quality score and on the number of reads supporting each chimera. Thus, it is important to carefully select the software on the basis of the structure of the RNA-seq data under analysis. Furthermore, the sensitivity of chimera detection tools does not seem to be sufficient to provide results consistent with those obtained in normal tissues on the basis of fusion events extracted from published data.
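The evaluation on positive and negative synthetic datasets reduces to comparing the set of reported fusions with the set of fusions known to be present. The sketch below assumes, for illustration, that each fusion is represented as an ordered (5' gene, 3' gene) tuple; the function name and the returned dictionary layout are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Fusion = Tuple[str, str]  # (5' gene, 3' gene); order matters


def evaluate(reported: Set[Fusion], truth: Set[Fusion]) -> Dict[str, Optional[float]]:
    """Compare a tool's reported fusions with the known synthetic fusions.

    For a negative dataset, pass an empty truth set: every reported
    fusion then counts as a false positive and sensitivity is None.
    """
    true_positives = reported & truth
    false_positives = reported - truth
    sensitivity = len(true_positives) / len(truth) if truth else None
    return {
        "true_positives": len(true_positives),
        "false_positives": len(false_positives),
        "sensitivity": sensitivity,
    }
```

Running the same comparison on datasets that differ in read length or quality score would expose the dependencies reported in the conclusions.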
Project description:CIC rearrangements have been reported in two-thirds of EWSR1-negative small blue round cell tumors (SBRCTs). However, a number of SBRCTs remain unclassified despite exhaustive analysis. Fourteen SBRCTs lacking driver genetic events by RNA sequencing (RNAseq) analysis were collected. Unsupervised hierarchical clustering was performed using samples from our RNAseq database, including 13 SBRCTs with non-CIC genetic abnormalities and 2 CIC-rearranged angiosarcomas among others. Remarkably, all 14 study cases showed high mRNA levels of ETV1/4/5, and by unsupervised clustering most grouped into a distinct cluster, separate from other tumors. Based on these results indicating a close relationship with CIC-rearranged tumors, we manually inspected CIC reads in RNAseq data. FISH for CIC and DUX4 abnormalities and immunohistochemical stains for ETV4 were also performed. In the control group, only 2 CIC-rearranged angiosarcomas had high ETV1/4/5 expression. Upon manual inspection of CIC traces, 7 of 14 cases showed CIC-DUX4 fusion reads, 2 cases had DUX4-CIC reads, while the remaining 5 were negative. FISH showed CIC break-apart in 7 cases, including 5 cases lacking CIC-DUX4 or DUX4-CIC fusion reads on RNAseq manual inspection. However, no CIC abnormalities were detected by FISH in 6 cases with CIC-DUX4 or DUX4-CIC reads. ETV4 immunoreactivity was positive in 7 of 11 cases. Our results highlight the underperformance of FISH and RNAseq methods in diagnosing SBRCTs with CIC gene abnormalities. The downstream ETV1/4/5 transcriptional up-regulation appears highly sensitive and specific and can be used as a reliable molecular signature and diagnostic method for CIC fusion positive SBRCTs.
Project description:Transcription factor fusion genes create oncoproteins that drive oncogenesis and represent challenging therapeutic targets. Understanding the molecular targets by which such fusion oncoproteins promote malignancy offers an approach to develop rational treatment strategies to improve clinical outcomes. Capicua-double homeobox 4 (CIC-DUX4) is a transcription factor fusion oncoprotein that defines certain undifferentiated round cell sarcomas with high metastatic propensity and poor clinical outcomes. The molecular targets regulated by the CIC-DUX4 oncoprotein that promote this aggressive malignancy remain largely unknown. We demonstrated that increased expression of ETS variant 4 (ETV4) and cyclin E1 (CCNE1) occurs via neomorphic, direct effects of CIC-DUX4 and drives tumor metastasis and survival, respectively. We uncovered a molecular dependence on the CCNE-CDK2 cell cycle complex that renders CIC-DUX4-expressing tumors sensitive to inhibition of the CCNE-CDK2 complex, suggesting a therapeutic strategy for CIC-DUX4-expressing tumors. Our findings highlight a paradigm of functional diversification of transcriptional repertoires controlled by a genetically aberrant transcriptional regulator, with therapeutic implications.
Project description:Small blue round cell tumors (SBRCTs) are a heterogenous group of tumors that are difficult to diagnose because of overlapping morphologic, immunohistochemical, and clinical features. About two-thirds of EWSR1-negative SBRCTs are associated with CIC-DUX4-related fusions, whereas another small subset shows BCOR-CCNB3 X-chromosomal paracentric inversion. Applying paired-end RNA sequencing to an SBRCT index case of a 44-year-old man, we identified a novel BCOR-MAML3 chimeric fusion, which was validated by reverse transcription polymerase chain reaction and fluorescence in situ hybridization techniques. We then screened a total of 75 SBRCTs lacking EWSR1, FUS, SYT, CIC, and BCOR-CCNB3 abnormalities for BCOR break-apart probes by fluorescence in situ hybridization to detect potential recurrent BCOR gene rearrangements outside the typical X-chromosomal inversion. Indeed, 8/75 (11%) SBRCTs showed distinct BCOR gene rearrangements, with 2 cases each showing either a BCOR-MAML3 or the alternative ZC3H7B-BCOR fusion, whereas no fusion partner was detected in the remaining 4 cases. Gene expression of the BCOR-MAML3-positive index case showed a distinct transcriptional profile with upregulation of HOX-gene signature, compared with classic Ewing's sarcoma or CIC-DUX4-positive SBRCTs. The clinicopathologic features of the SBRCTs with alternative BCOR rearrangements were also compared with a group of BCOR-CCNB3 inversion-positive cases, combining 11 from our files with a meta-analysis of 42 published cases. The BCOR-CCNB3-positive tumors occurred preferentially in children and in bone, in contrast to alternative BCOR-rearranged SBRCTs, which presented in young adults, with a variable anatomic distribution. Furthermore, BCOR-rearranged tumors often displayed spindle cell areas, either well defined in intersecting fascicles or blending with the round cell component, which appears distinct from most other fusion-positive SBRCTs and shares histologic overlap with poorly differentiated synovial sarcoma.
Project description:Next generation sequencing (NGS) technologies have enabled de novo gene fusion discovery that could reveal candidates with therapeutic significance in cancer. Here we present an open-source software package, ChimeraScan, for the discovery of chimeric transcription between two independent transcripts in high-throughput transcriptome sequencing data. Supplementary data are available at Bioinformatics online.
Project description:CIC-rearranged sarcomas (CRSs) have recently been characterized as a distinct sarcoma subgroup with a less favorable prognosis compared to other small round cell sarcomas. CRSs share morphologic features with Ewing's sarcoma and prior to 2013 were grouped under undifferentiated sarcomas with round cell phenotype by the WHO classification. In this report, whole-genome sequencing and RNA sequencing were performed for an adolescent male patient with CRS who was diagnosed with undifferentiated pleomorphic sarcoma (UPS) by three contemporary institutions. Somatic mutation analysis identified mutations in IQGAP1, CCNC, and ATXN1L in pre- and post-treatment tissue samples, as well as a CIC-DUX4 fusion that was confirmed by qPCR and DUX4 immunohistochemistry. Of particular interest was the overexpression of the translation factor eEF1A1, which has oncogenic properties and has recently been identified as a target of the investigational agent plitidepsin. This case may provide a valuable waypoint in the understanding and classification of CRSs and may provide a rationale for targeting eEF1A1 in similar soft tissue sarcoma cases.