Breakpoint analysis of transcriptional and genomic profiles uncovers novel gene fusions spanning multiple human cancer types (RNA-seq)
ABSTRACT: We report the design and implementation of a "breakpoint analysis" pipeline to discover novel gene fusions by tell-tale transcript level or genomic DNA copy number transitions occurring within genes. We use this method to prioritize candidate rearrangements from high density array CGH datasets as well as exon-resolution expression microarrays. We mine both publicly available data as well as datasets generated in our laboratory. Several gene fusion candidates were chosen for further characterization, and corresponding samples were profiled using paired end RNA sequencing to discover the identity of the gene fusion. Using this approach, we report the discovery and characterization of novel gene fusions spanning multiple cancer subtypes including angiosarcoma, pancreatic cancer, anaplastic astrocytoma, melanoma, breast cancer, and T-cell acute lymphoblastic leukemia. Taken together, this study provides a robust approach for gene fusion discovery, and our results highlight a more widespread role of fusion genes in cancer pathogenesis. Breakpoint analysis for the discovery of novel gene fusions across human cancers
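The idea of flagging a "tell-tale" transition within a gene can be sketched as a single change-point scan over probe values ordered 5' to 3'. This is only a minimal illustration, not the authors' pipeline: the function name `best_breakpoint`, the `min_side` parameter, the scoring formula, and the example values are all assumptions made for this sketch.

```python
from statistics import mean, pstdev

def best_breakpoint(values, min_side=2):
    """Scan every split of a 5'->3' ordered profile (exon-level expression
    or log2 copy-number ratios) and return (index, score) for the split with
    the largest mean shift relative to within-segment spread.
    index is the position of the first probe in the 3' segment."""
    best = (None, 0.0)
    for i in range(min_side, len(values) - min_side + 1):
        left, right = values[:i], values[i:]
        # Average within-segment spread; the constant avoids division by zero.
        spread = (pstdev(left) + pstdev(right)) / 2 + 1e-9
        score = abs(mean(right) - mean(left)) / spread
        if score > best[1]:
            best = (i, score)
    return best

# Hypothetical profile: low signal in the first four exons, high in the rest.
exon_levels = [0.1, 0.0, -0.1, 0.2, 2.1, 2.0, 2.3, 1.9]
index, score = best_breakpoint(exon_levels)
print(f"candidate breakpoint before probe {index}, score {score:.1f}")
```

A gene with a flat profile yields no candidate (`index` is `None`), so in a screen of this kind only genes with a high-scoring split would be carried forward for follow-up sequencing.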
Project description:SnowShoes-FTD, a fusion transcript discovery tool, was used to identify fusions in breast cancer cell lines from RNA-Seq data. Total RNA was extracted from the cell lines and used to construct RNA-Seq libraries for sequencing.
Project description:We detected fusion genes in 274 fresh surgical samples of gliomas using whole transcriptome sequencing. Using this approach we screened a panel of glioma samples and identified a number of activating novel fusion transcripts. Fusion detection in 274 glioma patients
Project description:Studies of fusion genes have mainly focused on the formation of fusions that result in the production of hybrid proteins or, alternatively, on promoter-switching events that put a gene under the control of aberrant signals. However, gene fusions may also disrupt the transcriptional control of genes that are encoded in introns downstream of the breakpoint. By ignoring structural constraints of the transcribed fusions, we highlight the importance of a largely unexplored function of fusion genes. Using breast cancer as an example, we show that miRNA host genes are specifically enriched in fusion genes and that many different, low-frequency, 5' partners may deregulate the same miRNA irrespective of the coding potential of the fusion transcript. These results indicate that the concept of recurrence, defined by the rate of functionally important aberrations, needs to be revised to encompass convergent fusions that affect a miRNA independently of transcript structure and protein-coding potential. Overall design: Illumina paired-end RNA-sequencing was performed on 1600 sequencing libraries (49 technical replicates, 1552 tumour samples) for fusion gene detection analysis. miRNA sequencing was performed on a subset of the fusion detection samples, 191 sequence libraries (5 technical replicates, 186 tumour samples), for miRNA transcript expression estimation. This submission represents the miRNA sequencing component of 191 libraries only. The authors state "due to Swedish law, the patient consent, and the risk that the sequencing data contains personally-identifiable information and hereditary mutations, we cannot deposit the short sequencing read data in a repository". Thus, this submission is incomplete.
Project description:Mutations of TCF4, which encodes a basic helix-loop-helix transcription factor, cause Pitt-Hopkins syndrome (PTHS) via multiple genetic mechanisms. TCF4 is a complex locus expressing multiple transcripts by alternative splicing and use of multiple promoters. We report a three-generation family segregating mild intellectual disability with an apparently balanced chromosomal translocation t(14;18)(q23.3;q21.2) that we characterized as a complex unbalanced karyotype 46,XY,der(14)del(14)(q23.3q23.3)t(14;18)(q23.3;q21.2)del(18)(q21.2q21.2) del(18)(q21.2q21.2)inv(18)(q21.2q21.2),der(18)t(14;18)(q23.3;q21.2) disrupting TCF4. Using whole genome sequencing, transcriptome sequencing, qRT-PCR and nCounter analysis, we characterized the breakpoint junctions from derivative chromosomes and gene expression at the TCF4 locus. Our analyses revealed that family members segregating mild intellectual disability with the complex chromosome aberration had normal expression of genes along chromosomes 14 or 18 and no marked changes in expression of genes other than TCF4. Affected individuals had 12-33 fold higher mRNA levels of TCF4 than did unaffected controls or individuals with PTHS. Increased levels of TCF4 transcript variants originating distal to the translocation breakpoint, not the fusion transcript generated by the derivative chromosome, contributed to this increase. Although validation in additional patients is required, our findings suggest that the dysmorphic features and severe intellectual disability characteristic of PTHS are partially rescued by overexpression of short TCF4 transcripts encoding a nuclear localization signal, a transcription activation domain, and the basic helix-loop-helix domain. Examination of TCF4 Isoform expression comparison between mutant and control skin fibroblast tissues
Project description:MLL-fusions represent a large group of leukemia drivers, whose diversity originates from the vast molecular heterogeneity of C-terminal fusion partners of MLL protein. While studies of selected MLL-fusions have revealed critical molecular pathways, unifying mechanisms across all MLL-fusions remain poorly understood. We present the first comprehensive survey of protein-protein interactions of seven distantly related MLL-fusion proteins: MLL-AF1p, MLL-AF4, MLL-AF9, MLL-CBP, MLL-EEN, MLL-ENL and MLL-GAS7.
Project description:Targeted long-read nanopore sequencing.
Abstract: Fusion genes are hallmarks of various cancer types and important determinants for diagnosis, prognosis and treatment. Fusion gene partner choice and breakpoint-position promiscuity restrict diagnostic detection, even for known and recurrent configurations. To accurately and impartially identify fusions, we developed FUDGE: FUsion Detection from Gene Enrichment. FUDGE couples target-selected and strand-specific CRISPR/Cas9 activity for fusion gene driver enrichment (without prior knowledge of fusion partner or breakpoint location) to long-read Nanopore sequencing with the bioinformatics pipeline NanoFG. FUDGE has flexible target-loci choices and enables multiplexed enrichment for simultaneous analysis of several genes in multiple samples in one sequencing run. We observe, on average, 665-fold breakpoint-site enrichment and identify nucleotide-resolution fusion breakpoints within two days. The assay identifies cancer cell line and tumor sample fusions irrespective of partner gene or breakpoint position. FUDGE is a rapid and versatile fusion detection assay, providing unparalleled opportunity for diagnostic pan-cancer fusion detection.
Project description:Transcription of spa, encoding the virulence factor protein A in Staphylococcus aureus, is tightly controlled by a complex regulatory network, ensuring its temporal expression over growth and at appropriate stages of the infection process. Transcriptomic profiling of XdrA, a DNA-binding protein that is conserved in all S. aureus genomes and shares similarity with the XRE family of helix-turn-helix, antitoxin-like proteins, revealed it to be a previously unidentified activator of spa transcription. To assess how XdrA fits into the complex web of spa regulation, a series of regulatory mutants was constructed, consisting of single, double, triple, and quadruple mutants lacking XdrA and/or the three key regulators previously shown to influence spa transcription directly (SarS, SarA, and RNAIII). A series of lacZ reporter gene fusions containing nested deletions of the spa promoter identified regions influenced by XdrA and the other three regulators. XdrA had almost as strong an activating effect on spa as SarS and acted on the same spa operator regions as SarS, or closely overlapping regions. All data from microarrays, Northern and Western blot analyses, and reporter gene fusion experiments indicated that XdrA is a major activator of spa expression that appears to act directly on the spa promoter and not through previously characterized regulators. Data is also available from http://bugs.sgul.ac.uk/E-BUGS-105
Project description:We have developed FusionSeq to identify fusion transcripts from paired-end RNA-sequencing. FusionSeq includes filters to remove spurious candidate fusions with artifacts such as misalignments or random pairing of transcript fragments and it ranks candidates according to several statistics. It also has a module to identify exact sequences at breakpoint junctions. FusionSeq detected known and novel fusions in a specially sequenced calibration data set, including 8 cancers with and without known rearrangements.
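As a rough illustration of the general principle behind paired-end fusion calling (read pairs whose two mates map to different genes), the sketch below counts supporting pairs per gene pair and ranks candidates by support. It is not FusionSeq's code or its filter set: the function `candidate_fusions`, the `min_pairs` threshold, and the placeholder gene names are hypothetical, and real tools apply many additional filters for misalignment and random pairing.

```python
from collections import Counter

def candidate_fusions(pair_genes, min_pairs=2):
    """pair_genes: iterable of (gene_of_mate1, gene_of_mate2) annotations,
    with None for an unannotated mate. Returns [(gene_pair, n_pairs), ...]
    sorted by decreasing support, keeping pairs seen at least min_pairs times."""
    counts = Counter()
    for gene_a, gene_b in pair_genes:
        # Skip unannotated mates and concordant pairs within a single gene.
        if gene_a is None or gene_b is None or gene_a == gene_b:
            continue
        # Sort so that (A, B) and (B, A) support the same candidate.
        counts[tuple(sorted((gene_a, gene_b)))] += 1
    return [(pair, n) for pair, n in counts.most_common() if n >= min_pairs]

# Toy input with placeholder gene names.
pairs = [
    ("GENE_A", "GENE_B"), ("GENE_B", "GENE_A"), ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_C"),   # concordant pair, ignored
    ("GENE_D", "GENE_E"),   # single supporting pair, below threshold
    ("GENE_B", None),       # one mate unannotated, ignored
]
result = candidate_fusions(pairs)
print(result)
```

Requiring more than one supporting pair is a simple guard against chance pairing of fragments; it stands in here for the more elaborate statistics the tool description mentions.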
Project description:Histiocytic neoplasms are clonal, hematopoietic disorders characterized by an accumulation of abnormal, monocyte-derived dendritic cells or macrophages in Langerhans Cell (LCH) and non-Langerhans (non-LCH) histiocytoses, respectively. The discovery of BRAFV600E mutations in ~50% of these patients provided the first molecular therapeutic target in histiocytosis. However, recurrent driving mutations in the majority of BRAFV600E-wildtype, non-LCH patients are unknown, and recurrent cooperating mutations in non-MAP kinase pathways are undefined for the histiocytic neoplasms. Through combined whole exome and transcriptome sequencing, we identified recurrent kinase fusions involving BRAF, ALK, and NTRK1, as well as recurrent, activating MAP2K1 and ARAF mutations in BRAFV600E-wildtype, non-LCH patients. In addition to MAP kinase pathway lesions, recurrently altered genes involving diverse cellular pathways were identified. Treatment of MAP2K1- and ARAF-mutated, non-LCH patients using MEK and RAF inhibitors, respectively, resulted in clinical efficacy demonstrating the importance of detecting and targeting diverse kinase alterations in these disorders. 13 patient samples were analyzed by RNA-seq and had 2 replicates.
Project description:Purpose: Assessment of the performance characteristics of an RNA-Seq assay designed to detect gene fusions in 573 genes to aid in the management of cancer patients. Methods: Polyadenylated RNA was converted to cDNA which was then used to prepare NGS libraries that were sequenced on a HiSeq 2500 instrument and analyzed with an in-house developed bioinformatic pipeline. Results: The assay identified 38 of 41 (93%) gene fusions previously detected by a different laboratory using FISH, RT-PCR, or RNA-Seq for a sensitivity of 93%. No false positive gene fusions were identified in 15 normal tissue specimens and 10 tumor specimens that were negative for fusions by RNA-Seq in a different laboratory (100% specificity). The assay also identified 22 fusions in 17 tumor specimens that had not been detected by other methods. Nineteen of the 22 fusions had not previously been described. Good intra- and inter-assay reproducibility was observed with complete concordance for the presence or absence of gene fusions in replicates. The analytical sensitivity of the assay was tested by diluting RNA isolated from gene fusion positive cases with fusion negative RNA. Gene fusions were generally detectable down to 12.5% dilutions for most fusions and as little as 3% for some fusions. This assay should be useful for identifying cancer patients that may benefit from both FDA-approved and investigational targeted therapies. Overall design: Sequencing data was generated using Hiseq 2500 with a library of 101 paired end reads in the rapid run mode
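The reported sensitivity and specificity follow from the counts given above; the short worked example below reproduces the arithmetic. The function names are illustrative, and the specificity calculation assumes the 15 normal and 10 fusion-negative tumor specimens together form the set of true negatives.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of known positives that the assay detected."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of known negatives that the assay called negative."""
    return true_neg / (true_neg + false_pos)

# 38 of 41 previously detected fusions were identified.
sens = sensitivity(true_pos=38, false_neg=41 - 38)
# No false positives among 15 normal + 10 fusion-negative tumor specimens.
spec = specificity(true_neg=15 + 10, false_pos=0)

print(f"sensitivity = {sens:.1%}, specificity = {spec:.0%}")
```

The unrounded sensitivity is 38/41, about 92.7%, which rounds to the 93% quoted in the description.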