Dataset Information

Using RNA-Seq to create sample-specific proteomic databases that enable mass spectrometric discovery of splice junction peptides

ABSTRACT: Many new alternative splice forms have been detected at the transcript level using next generation sequencing (NGS) methods, especially RNA-Seq, but it is not known how many of these transcripts are being translated. Leveraging the unprecedented capabilities of NGS, we collected RNA-Seq and proteomics data from the same cell population (Jurkat cells) and created a bioinformatics pipeline that builds customized databases for the discovery of novel splice-junction peptides. Results: Eighty million paired-end Illumina reads and ~500,000 tandem mass spectra were used to identify 12,873 transcripts (19,320 including isoforms) and 6,810 proteins. We developed a bioinformatics workflow to retrieve high-confidence, novel splice junction sequences from the RNA data, translate these sequences into the analogous polypeptide sequence, and create a customized splice junction database for MS searching.
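The translation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the three-frame forward translation, the splitting at stop codons, and the minimum fragment length are all assumptions chosen for illustration. A real workflow would also need to confirm that each peptide actually spans the junction.

```python
# Illustrative sketch only: function names, the min_len cutoff, and the
# FASTA header layout are hypothetical, not taken from the dataset record.
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate a DNA sequence in one forward frame; unknown codons become 'X'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def junction_peptides(junction_seq, min_len=6):
    """Translate in three forward frames, split at stops, keep long fragments."""
    peptides = []
    for frame in range(3):
        for fragment in translate(junction_seq, frame).split("*"):
            if len(fragment) >= min_len:
                peptides.append((frame, fragment))
    return peptides


def to_fasta(junction_id, peptides):
    """Format (frame, peptide) pairs as FASTA records for a search database."""
    lines = []
    for n, (frame, peptide) in enumerate(peptides, start=1):
        lines.append(f">{junction_id}|frame{frame}|{n}")
        lines.append(peptide)
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up junction sequence, used only to exercise the functions.
    example = "ATGGCCAAAGGGTTTCCCTAA"
    print(to_fasta("junc1", junction_peptides(example)))
```

Translating all three frames avoids assuming that the reading frame of the upstream exon is known; if the frame is available from the annotation, a single-frame translation would produce a smaller database.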

ORGANISM(S): Homo sapiens

PROVIDER: GSE45428 | GEO | 2013/05/17

SECONDARY ACCESSION(S): PRJNA193719

REPOSITORIES: GEO

Similar Datasets

Using RNA-Seq to create sample-specific proteomic databases that enable mass spectrometric discovery of splice junction peptides

Project description:Many new alternative splice forms have been detected at the transcript level using next generation sequencing (NGS) methods, especially RNA-Seq, but it is not known how many of these transcripts are being translated. Leveraging the unprecedented capabilities of NGS, we collected RNA-Seq and proteomics data from the same cell population (Jurkat cells) and created a bioinformatics pipeline that builds customized databases for the discovery of novel splice-junction peptides. Results: Eighty million paired-end Illumina reads and ~500,000 tandem mass spectra were used to identify 12,873 transcripts (19,320 including isoforms) and 6,810 proteins. We developed a bioinformatics workflow to retrieve high-confidence, novel splice junction sequences from the RNA data, translate these sequences into the analogous polypeptide sequence, and create a customized splice junction database for MS searching. Jurkat T-cell mRNA was analyzed on an Illumina HiSeq2000. ~80 million paired end reads (2x200bp, ~350bp lengths) were collected.

2013-05-17 | E-GEOD-45428 | biostudies-arrayexpress

Human colorectal cancer CPTAC dataset

Project description:This is the same data set generated by Vanderbilt University. 6M spectra were generated from 95 colorectal cancer samples. We used the SpliceDB tool to generate a customized database that includes novel/alternative splice junctions and small SNV information.

2015-06-18 | MSV000079166 | MassIVE

Genome-wide analysis of alternative splicing in Caenorhabditis elegans

Project description:Alternative splicing (AS) plays a crucial role in the diversification of gene function and regulation. Consequently, the systematic identification and characterization of temporally regulated splice variants is of critical importance to understanding animal development. We have used high-throughput RNA sequencing and microarray profiling to analyze AS in C. elegans across various stages of development. This analysis identified thousands of novel splicing events, including hundreds of developmentally regulated AS events. To make these data easily accessible and informative, we constructed the C. elegans Splice Browser, a web resource in which researchers can mine AS events of interest and retrieve information about their relative levels and regulation across development. The data presented in this study, along with the Splice Browser, provide the most comprehensive set of annotated splice variants in C. elegans to date, and are therefore expected to facilitate focused, high-resolution in vivo functional assays of AS function. Alternative splicing events were identified from alignments of C. elegans mRNA/EST sequences (UniGene Build #26) to C. elegans genomic sequence (NCBI timestamp: Sept. 25, 2006), essentially as previously described (Pan et al. 2005; Pan et al. 2004). In total, 499 cassette-type AS events were identified. For each AS event, 3 exon probes and 3 exon junction probes were designed to profile the AS event on the microarray, essentially as previously described (Pan et al. 2004). This submission represents the expression microarray component of the study.

2010-12-14 | E-GEOD-25927 | biostudies-arrayexpress

taxalogue: a toolkit to create comprehensive CO1 reference databases

Project description: Not available

| S-EPMC10702336 | biostudies-literature

Wide-ranging analysis of microRNA profiles in sporadic amyotrophic lateral sclerosis using next-generation sequencing

Project description:miRNAs have an important role in the diagnosis and treatment of amyotrophic lateral sclerosis (ALS). We aimed to profile the dysregulation of miRNAs in ALS blood and neuromuscular junction, as well as in healthy control blood, by next-generation sequencing (NGS). The expression of three miRNAs up-regulated in the ALS samples compared to healthy controls (miR-338-3p, miR-223-3p and miR-326) was validated by qRT-PCR in a cohort of 45 samples collected previously. Bioinformatics tools were used to perform target analysis of ALS microRNAs and to predict the secondary structure of novel miRNAs. The analysis of the NGS data identified 696 and 44 novel miRNAs which were differentially expressed in ALS tissues.

2018-12-12 | E-MTAB-7073 | biostudies-arrayexpress

Nano-HPLC-MS/MS analysis for proteins encoded by circRNA in myoblasts and myotubes

Project description:The MS/MS spectra were searched against a custom protein database built from open reading frame predictions spanning circRNA back-splice junctions.

2020-08-21 | PXD021030 |

Versatile toolkit for high-efficiency and scarless overexpression of circular RNAs

Project description:Circular RNAs (circRNAs) are a class of ubiquitously expressed, single-stranded, covalently closed (i.e. circularised) RNAs that contain a unique nucleotide sequence created by the ligation of their 5' and 3' ends, called the back-splice junction. Understanding the cellular roles of circRNAs involves, in part, investigating the effects on cell phenotype of increased expression of individual circRNAs. This is frequently done by transfecting cells with plasmid DNA containing cloned exons from which the circRNA is transcribed, flanked by sequences that promote circularisation. We observed that all commonly used plasmids we tested unexpectedly incorporated vector sequence as a molecular scar into the circRNA back-splice junction upon circularisation. Stepwise redesign of the cloning vector corrected this problem, ensuring bona fide circRNAs are produced with their natural back-splice junction at high efficiency. The fidelity of circRNAs produced from this new construct was validated by RNA sequencing. To increase the utility of this modified resource for expressing circRNA, we developed an expanded set of vectors incorporating this design that enable selection with a variety of antibiotics and fluorescent proteins and offer a range of promoters varying in strength, plus a complementary set of lentiviral plasmids for difficult-to-transfect cells. These resources provide a versatile toolkit for high-efficiency and scarless overexpression of circular RNAs that fulfils a critical need for the investigation of circRNA function, including the role of the unique back-splice junction.

2023-12-14 | GSE246020 | GEO

High-resolution custom array spanning Xq22 region to study patients carrying PLP1 copy number gain

Project description:We investigated the features of the genomic rearrangements in a cohort of 50 male individuals with proteolipid protein 1 (PLP1) copy number gain events who were ascertained with Pelizaeus-Merzbacher disease (PMD; MIM: 312080). Genomic rearrangements in PMD individuals with PLP1 copy number gain events were investigated by high-density customized array and breakpoint junction sequence analysis. Analysis of these data enabled the spectrum and relative distribution of the underlying genomic mutational signatures to be delineated.

2019-10-09 | GSE138542 | GEO

Inspecting VWF gene deep intronic region and in-depth study of the endothelial colony-forming cells to identify an underlying pathogenic molecular mechanism in a type 3 VWD patient

Project description:A type 3 von Willebrand disease (VWD) index patient (IP) remained mutation-negative after completion of conventional diagnostic analysis, including multiplex ligation-dependent probe amplification and sequencing of the promoter, exons, and flanking intronic regions of the VWF gene (VWF). In this study, we intended to elucidate the causative genetic defect through screening of the whole VWF (including the complete intronic region), mRNA analysis, and study of the patient-derived endothelial colony-forming cells (ECFCs). The entire VWF was analyzed by next-generation sequencing (NGS) on an Illumina platform. The NGS revealed a novel variant in VWF intron 8 (997+118 T>G). Subsequent assessments using bioinformatics tools (e.g. SpliceAI) predicted that this variant creates a new donor splice site (ss) in intron 8, which could outcompete the consensus 5’ donor ss at the exon/intron 8 junction. This leads to an aberrant mRNA that contains a premature stop codon, targeting it to nonsense-mediated mRNA decay. The VWF mRNA from whole blood and isolated ECFCs was quantified using the TaqMan assay on an ABI 7500 real-time PCR system. The quantitative analysis confirmed the virtual absence of VWF mRNA. Additionally, the level of secreted VWF from IP ECFCs was considerably reduced (~6% of healthy donors).

2022-03-30 | GSE195695 | GEO

Rat deconvolution as knowledge miner for immune cell trafficking from toxicogenomics databases

Project description:Toxicogenomics databases are useful for understanding biological responses in individuals because they are derived from well-controlled experiments and include a diverse spectrum of biological responses. Although these databases contain no information regarding immune cells in the liver, which are important in the progression of liver injury, deconvolution that estimates cell-type proportions from bulk transcriptome could add information regarding immune cell trafficking to the database. However, deconvolution has been mainly applied to humans and mice and less often to rats, which are the main target of toxicogenomics databases. Here, we developed a deconvolution method for rats and established a methodology to obtain information regarding immune cells from toxicogenomics databases. The contributions of this work are three-fold. First, we obtained the gene expression profiles of various rat immune cells necessary for deconvolution and constructed a dataset; second, we compared the accuracy of models based on human and mouse datasets and showed the impact of species differences on deconvolution; third, we showed that rat deconvolution could retrieve information regarding immune cell trafficking from toxicogenomics databases. Correspondence: Tadahaya Mizuno

2023-08-15 | GSE239996 | GEO
