Dataset Information

Identification of Differentially Expressed Splice Variants by the Proteogenomic Pipeline Splicify

ABSTRACT: Proteogenomics, i.e. the comprehensive integration of genomics and proteomics data, is a powerful approach for identifying novel protein biomarkers. This is especially the case for proteins that differ structurally between disease and control conditions. As tumor development is associated with aberrant splicing, we focus on this rich source of cancer-specific biomarkers. To this end, we developed a proteogenomic pipeline, Splicify, which detects differentially expressed protein isoforms. Splicify integrates massively parallel RNA sequencing data with tandem mass spectrometry proteomics data to identify protein isoforms resulting from differential splicing between two conditions. Proof of concept was obtained by applying Splicify to RNA sequencing and mass spectrometry data from the colorectal cancer cell line SW480, before and after siRNA-mediated down-modulation of the splicing factors SF3B1 and SRSF1. These analyses revealed 2172 and 149 differentially expressed isoforms with peptide confirmation upon knock-down of SF3B1 and SRSF1, respectively, compared to their controls. Splice variants identified included RAC1, OSBPL3, MKI67 and SYK. One additional sample was analyzed by PacBio Iso-Seq full-length transcript sequencing after SF3B1 down-modulation. This analysis verified the alternative splicing identified by Splicify and additionally identified novel splicing events that were not represented in the human reference genome annotation. Splicify therefore offers a validated proteogenomic data analysis pipeline for identifying disease-specific protein biomarkers resulting from alternative mRNA splicing. Splicify is publicly available on GitHub (https://github.com/NKI-TGO/SPLICIFY) and is suitable for addressing basic research questions using pre-clinical model systems as well as translational research questions using patient-derived samples, e.g. to identify clinically relevant biomarkers.
This dataset corresponds to a publication in Molecular & Cellular Proteomics 16: 1850–1863, 2017 (DOI: 10.1074/mcp.TIR117.000056; PMID: 28747380). The corresponding mass spectrometry data are available from the ProteomeXchange Consortium via the PRIDE partner repository under the dataset identifier PXD006486.
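
The idea of "peptide confirmation" of a splice variant can be illustrated with a toy example. The sketch below is not Splicify's implementation. It assumes each exon has already been translated to amino acids, ignores codons split across exon boundaries, and uses invented sequences and a hypothetical helper named junction_peptides. It only shows why a peptide that spans an exon-exon junction is evidence for one isoform, while a peptide lying entirely within a shared exon is not.

```python
# Toy illustration only: all sequences and names are invented for this sketch
# and are not taken from Splicify or from the dataset.

def junction_peptides(upstream: str, downstream: str, peptides: list[str]) -> list[str]:
    """Return the peptides that match the joined sequence AND span the junction."""
    joined = upstream + downstream
    junction = len(upstream)  # index of the first residue from the downstream exon
    hits = []
    for pep in peptides:
        start = joined.find(pep)
        while start != -1:
            end = start + len(pep)
            # Spanning means at least one residue on each side of the junction.
            if start < junction < end:
                hits.append(pep)
                break
            start = joined.find(pep, start + 1)
    return hits


# Hypothetical exon translations (amino acids).
exon1, exon2, exon3 = "MKTAYIAK", "QRQISFVK", "SHFSRQLEER"

# Hypothetical peptides identified by mass spectrometry.
observed = ["IAKQRQ", "IAKSHF", "SHFSRQ"]

# Exon-inclusion isoform: exon1 is joined to exon2.
inclusion_support = junction_peptides(exon1, exon2, observed)
# Exon-skipping isoform: exon1 is joined directly to exon3.
skipping_support = junction_peptides(exon1, exon3, observed)

print("inclusion:", inclusion_support)
print("skipping:", skipping_support)
```

In this toy input, "SHFSRQ" lies entirely within the third exon, so it matches both isoforms and supports neither junction specifically.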

ORGANISM(S): Homo sapiens

PROVIDER: GSE108140 | GEO | 2017/12/16

SECONDARY ACCESSION(S): PRJNA422579

REPOSITORIES: GEO

Similar Datasets

Identification of differentially expressed splice variants by the proteogenomic pipeline SPLICIFY
2017-07-28 | PXD006486 | Pride

RNA sequencing for human induced pluripotent stem cell cardiomyocyte differentiation
2019-11-15 | GSE137920 | GEO

Transcription profiling of mouse fetal liver feeder layer and endothelial progenitor cells
2007-12-15 | E-GEOD-1727 | biostudies-arrayexpress

An integrated landscape of mRNA and protein isoforms [NGS]
2026-03-12 | GSE261379 | GEO

An integrated landscape of mRNA and protein isoforms [PacBio]
2026-03-12 | GSE261578 | GEO

Exon level integration of proteomics and microarray data
2009-11-24 | GSE19154 | GEO

The RNAseq of 79 small cell lung cancer (sclc) and 7 normal control
2016-02-10 | E-GEOD-60052 | biostudies-arrayexpress

Transcriptomic analyses of primary uveal melanomas
2016-07-03 | E-MTAB-4097 | biostudies-arrayexpress

Transposable element exonization by non-canonical splicing generates a diversity reservoir of functional protein isoforms
2024-12-12 | GSE234223 | GEO