Project description:This project contains a human breast milk dataset. We predict all spectra in the dataset with Prosit and then rescore. Starting from MaxQuant search results at 100% FDR, we use Percolator to obtain two sets of 1% FDR-filtered results: one based on Andromeda scores and one based on features extracted from the Prosit predictions.
Project description:This project contains a human yeast dilution dataset. We predict all spectra in the dataset with Prosit and then rescore. Starting from MaxQuant search results at 100% FDR, we use Percolator to obtain two sets of 1% FDR-filtered results: one based on Andromeda scores and one based on features extracted from the Prosit predictions.
Project description:This project contains a pancreas MS2/MS3 dataset. We predict all spectra in the dataset with Prosit and then rescore. Starting from MaxQuant search results at 100% FDR, we use Percolator to obtain two sets of 1% FDR-filtered results: one based on Andromeda scores and one based on features extracted from the Prosit predictions.
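The 1% FDR filtering mentioned in the descriptions above can be illustrated with a minimal target-decoy sketch. This is not Percolator's implementation (Percolator additionally trains a semi-supervised classifier to produce the scores); it only shows how q-values could be derived from scored target and decoy PSMs. The function names `q_values` and `filter_targets` and the toy scores are hypothetical.

```python
def q_values(scores, is_decoy):
    """Estimate a q-value per PSM from target-decoy competition results.

    scores   : list of floats, higher means a better match (assumption)
    is_decoy : list of bools, True for decoy PSMs
    """
    # Rank PSMs from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # FDR at each rank, estimated as (#decoys / #targets) at or above that score.
    fdrs = []
    targets = decoys = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at which this PSM would still be accepted.
    q = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        q[order[rank]] = running_min
    return q


def filter_targets(scores, is_decoy, fdr=0.01):
    """Return indices of target PSMs whose q-value is at most `fdr`."""
    q = q_values(scores, is_decoy)
    return [i for i in range(len(scores)) if not is_decoy[i] and q[i] <= fdr]


# Toy example (made-up scores, for illustration only).
toy_scores = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0]
toy_decoy = [False, False, False, True, False, True]
accepted = filter_targets(toy_scores, toy_decoy, fdr=0.01)
print(accepted)
```

In this sketch, the same filter would be applied once to Andromeda-based scores and once to scores learned from Prosit-derived features, which is what makes the two 1% FDR result sets comparable.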
Project description:It has been shown that integrating peptide property predictions such as fragment intensity into the scoring of peptide-spectrum matches can greatly increase the number of confidently identified peptides compared to traditional scoring methods. Here, we introduce Prosit-XL, a robust and accurate fragment intensity predictor covering cleavable (DSSO/DSBU) and non-cleavable (DSS/BS3) cross-linkers, achieving high accuracy on various holdout sets with consistent performance on external datasets without fine-tuning. Due to the complex nature of false positives in XL-MS, a novel approach to data-driven rescoring was developed that benefits from Prosit-XL’s predictions while limiting the overestimation of the false discovery rate (FDR). We first evaluated this approach using two ground truth datasets (PXD029252, PXD042173), which demonstrated accurate and precise FDR estimation. Second, we applied Prosit-XL to proteome-scale datasets (JPST000845, PXD017711), demonstrating an up to ~3.4-fold improvement in PPI discovery compared to classic approaches. Finally, Prosit-XL was used to increase the coverage and depth of a spatially resolved interactome map of intact human cytomegalovirus virions (PXD031911), leading to the discovery of previously unobserved interactions between human and cytomegalovirus proteins.
Project description:Immunopeptidomics aims to identify Major Histocompatibility Complex-presented peptides on every cell that can be used in anti-cancer vaccine development. However, existing immunopeptidomics data analysis pipelines suffer from the non-tryptic nature of immunopeptides, which complicates their identification. Previously, peak intensity predictions by MS²PIP and retention time predictions by DeepLC have been shown to improve tryptic peptide identifications when rescoring peptide-spectrum matches with Percolator. However, as MS²PIP was tailored towards tryptic peptides, we have here retrained MS²PIP to include non-tryptic peptides. Interestingly, the new models not only greatly improve predictions for immunopeptides, but also yield further improvements for tryptic peptides. We show that the integration of the new MS²PIP models, DeepLC, and Percolator in one software package, MS²Rescore, increases the spectrum identification rate and the number of unique identified peptides by 46% and 36%, respectively, compared to standard Percolator rescoring at 1% FDR. Moreover, MS²Rescore also outperforms the current state-of-the-art in immunopeptide-specific identification approaches. Integration of immunopeptide MS²PIP models, DeepLC, and Percolator into MS²Rescore thus allows substantially improved identification of novel epitopes from existing immunopeptidomics workflows.
Project description:To compare the rescoring performance on Orbitrap versus timsTOF data, we utilized a comparison dataset comprising both HLA-I and HLA-II peptides measured on an Orbitrap and on a timsTOF. For detailed information on data acquisition, please refer to the original publication by Gravel et al. (PXD038782). In brief, 10 samples were measured in technical triplicate (two technical replicates for the HNSCC sample) on the Orbitrap Fusion Lumos mass spectrometer (Thermo Fisher Scientific, Waltham, USA) and on the timsTOF Pro (Bruker Daltonik, Germany).
Individual spectrum peak files were searched against a database containing 20,598 human UniProt entries downloaded from https://www.ebi.ac.uk/reference_proteomes/ with MaxQuant version 2.0.3.1 and rescored by integrating Prosit's fragment ion intensity predictions, using Oktoberfest (https://github.com/wilhelm-lab/oktoberfest). To perform rescoring on the Orbitrap data we employed the 2020 CID Prosit model with a CE set to 35 for HLA-I peptides, and the 2020 HCD Prosit model with CE set to 30 for the HLA-II peptides. For timsTOF data, rescoring was performed using the TOF Prosit 2023 model with the reported CEs for each PSM.
Rescoring the Orbitrap data resulted in on average 2.5-fold more unique HLA-I peptides and 1.4-fold more unique HLA-II peptides. In contrast, rescoring timsTOF data resulted in a higher increase, with on average 2.8-fold more unique HLA-I peptides and 1.7-fold more unique HLA-II peptides.
Project description:The development of the TMTpro-16plex series expanded the breadth of commercial isobaric tagging reagents by nearly 50% over classic TMT-11plex. In addition to the described 16plex reagents, the proline-based TMTpro molecule can accommodate two additional combinations of heavy carbon and nitrogen isotopes. Here, we introduce the final two labeling reagents, TMTpro-134C and TMTpro-135N, which permit the simultaneous global protein profiling of 18 samples with essentially no missing values. For example, six conditions with three biological replicates can now be perfectly accommodated. We showcase the 18plex reagent set by profiling the proteome and phosphoproteome of a pair of isogenic mammary epithelial cell lines under three conditions in triplicate. We compare the depth and quantitative performance of this data set with a TMTpro-16plex experiment in which two samples were omitted. Our analysis revealed similar numbers of quantified peptides and proteins, with high quantitative correlation. We interrogated further the TMTpro-18plex data set by highlighting changes in protein abundance profiles under different conditions in the isogenic cell lines. We conclude that TMTpro-18plex further expands the sample multiplexing landscape, allowing for complex and innovative experimental designs.
Project description:Citrullination is a key yet underexplored post-translational modification involved in various biological processes. Its identification via mass spectrometry faces challenges such as limited enrichment tools and false positives due to mass overlap with deamidation (+0.9840 Da). To address this, we developed a data analysis pipeline integrating the deep learning model Prosit-Cit, trained on ~53,000 spectra from ~2,500 synthetic citrullinated peptides, which improves sensitivity and precision in identifying citrullination sites. This approach has identified up to 14 times more citrullinated sites in human tissue proteomes and revealed new insights, including the first large-scale citrullination mapping in Arabidopsis. This upload includes: 1) Raw files and search results from the evaluation dataset, used to assess the precision of citrullination identifications. 2) Raw files, search results, and rescoring outcomes from validation experiments conducted on Arabidopsis flowers. 4) Re-analyzed search and rescoring results from human (PXD010154) and Arabidopsis (PXD013868) tissue proteomes.
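The mass overlap with deamidation noted above follows from the two modifications having the same net elemental change (one oxygen gained, one nitrogen and one hydrogen lost). A small sketch, using standard monoisotopic masses and a hypothetical helper `delta_mass`, makes the arithmetic explicit; it is an illustration, not part of the described pipeline.

```python
# Monoisotopic masses in Da (standard values).
MONOISOTOPIC = {
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.9949146196,
}


def delta_mass(composition_change):
    """Mass shift in Da for a net elemental change, e.g. {"O": 1, "N": -1}."""
    return sum(MONOISOTOPIC[element] * count
               for element, count in composition_change.items())


# Citrullination: arginine -> citrulline (guanidino =NH replaced by =O).
CITRULLINATION = {"O": 1, "N": -1, "H": -1}
# Deamidation: Asn/Gln -> Asp/Glu (amide -NH2 replaced by -OH).
DEAMIDATION = {"O": 1, "N": -1, "H": -1}

cit_shift = delta_mass(CITRULLINATION)
deam_shift = delta_mass(DEAMIDATION)

print(f"{cit_shift:.4f}")       # 0.9840
print(cit_shift == deam_shift)  # True: precursor mass alone cannot separate them
```

Because the shifts are identical, distinguishing the two requires evidence beyond the precursor mass, such as the modified residue's position and fragment ion intensities, which is where a fragment intensity predictor can contribute.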
Project description:The development of the TMTpro-16plex series expanded the breadth of commercial isobaric tagging reagents by nearly 50% over classic TMT-11plex. In addition to the described 16plex reagents, the proline-based TMTpro molecule can accommodate two additional combinations of heavy carbon and nitrogen isotopes. Here, we introduce the final two labeling reagents, TMTpro-134C and TMTpro-135N, which permit the simultaneous global protein profiling of 18 samples with no missing values. For example, six conditions with three biological replicates can now be perfectly accommodated. We showcase the 18plex reagent set by profiling the proteome and phosphoproteome of a pair of isogenic breast cancer cell lines under three conditions in triplicate. We compare the depth and quantitative performance of this dataset with a TMTpro-16plex experiment in which two samples were omitted. Our analysis revealed similar numbers of quantified peptides and proteins, with high quantitative correlation. We interrogated further the TMTpro-18plex dataset by highlighting changes in protein abundance profiles under different conditions in the isogenic cell lines. We conclude that TMTpro-18plex further expands the sample multiplexing landscape, allowing for complex and innovative experimental designs.
Project description:Stable isotope labeling is a leading strategy for mass-spectrometry-based peptide quantification. Whereas TMTpro isobaric tagging can quantify up to 16 multiplexed samples in a single experiment, nonisobaric, yet chromatographically indistinguishable, variants of TMTpro reagents can be used in conjunction with the isobaric tag series for various peptide-targeting applications. Here we test the performance of two nonisobaric TMTpro variants, a stable-isotope-free TMTproZero tag and a nearly fully isotope-labeled "super-heavy" variant, shTMTpro, in a targeted assay for peptides of charge state 4+. We label each peptide with TMTproZero or Super Heavy TMTpro reagents and separately spike each peptide into a TMTpro16-labeled background (equal amount of peptide across all 16 channels). We observe that the expected 1:1 reporter ion ratio is distorted when a TMTproZero-labeled peptide is used; however, we note no such interference when shTMTpro substitutes the TMTproZero tag. Our data suggest that using the Super Heavy TMTpro reagent is an improvement over the TMTproZero reagent for the accurate quantification of high-charge-state peptides for trigger-based multiplexed assays.