Proteomics

Dataset Information

The choice of search engine affects sequencing depth and HLA allele-specific peptide repertoires.


ABSTRACT: Standardisation of immunopeptidomics experiments across laboratories is a pressing issue within the field, and currently a variety of different methods for sample preparation and data analysis tools are applied. Here, we compared different software packages commonly used to interrogate immunopeptidomics datasets, in order to understand to what extent differences in performance can be observed. We found that a de novo-assisted database search reports substantially more peptide sequences (~30-70%) compared to three database search engines at a global FDR of <1%. This effect was reproducible across four immunopeptidomic datasets. We validated the results using data generated with a synthetic library of 2000 HLA-associated peptides from four HLA alleles, half of which were previously observed by LC-MS and half of which were only predicted. Our investigation reveals that search engines create a bias in peptide sequence length distribution and peptide amino acid composition. Therefore, the choice of peptide identification method strongly influences the proportion of peptide sequences identified for each HLA allele, and resulting data should be interpreted with caution.
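
The kind of comparison summarised in the abstract (relative gain in unique peptide sequences between search engines, and the resulting peptide length distribution) can be illustrated with a minimal Python sketch. This is not the authors' analysis pipeline: the engine labels `engine_a` and `engine_b`, the helper functions, and the toy peptide lists are all hypothetical, and the input is assumed to be already filtered to the desired FDR.

```python
from collections import Counter


def length_distribution(peptides):
    """Return the fraction of unique peptides at each sequence length."""
    unique = set(peptides)
    counts = Counter(len(p) for p in unique)
    total = len(unique)
    return {length: n / total for length, n in sorted(counts.items())}


def percent_gain(reference, other):
    """Percent more unique peptides in `other` than in `reference`."""
    ref_n, other_n = len(set(reference)), len(set(other))
    if ref_n == 0:
        raise ValueError("reference set is empty")
    return 100.0 * (other_n - ref_n) / ref_n


# Hypothetical toy identifications, assumed to be FDR-filtered already.
# These lists are illustrative only and are not taken from PXD025655.
results = {
    "engine_a": ["SIINFEKL", "GILGFVFTL", "NLVPMVATV", "KLGGALQAK"],
    "engine_b": ["SIINFEKL", "GILGFVFTL", "NLVPMVATV", "KLGGALQAK",
                 "YLEPGPVTA", "ELAGIGILTV"],
}

gain = percent_gain(results["engine_a"], results["engine_b"])
dist_b = length_distribution(results["engine_b"])
shared = set(results["engine_a"]) & set(results["engine_b"])

print(f"engine_b reports {gain:.0f}% more unique peptides than engine_a")
print("length distribution (engine_b):", dist_b)
print("peptides shared by both engines:", len(shared))
```

Comparing sets of unique sequences, rather than spectrum-level matches, keeps the sketch focused on repertoire differences; a real comparison would also need to handle modifications and isoleucine/leucine ambiguity.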

INSTRUMENT(S): Q Exactive

ORGANISM(S): Homo sapiens (human)

TISSUE(S): Epithelial Cell, Cell Culture

DISEASE(S): Cervix Carcinoma, Glioblastoma, Cutaneous Malignant Melanoma 1

SUBMITTER: Robert Parker  

LAB HEAD: Nicola Ternette

PROVIDER: PXD025655 | Pride | 2021-08-10

REPOSITORIES: Pride

Publications

The Choice of Search Engine Affects Sequencing Depth and HLA Class I Allele-Specific Peptide Repertoires.

Robert Parker, Arun Tailor, Xu Peng, Annalisa Nicastri, Johannes Zerweck, Ulf Reimer, Holger Wenschuh, Karsten Schnatbaum, Nicola Ternette

Molecular & Cellular Proteomics: MCP, 2021-07-23


Standardization of immunopeptidomics experiments across laboratories is a pressing issue within the field, and currently a variety of different methods for sample preparation and data analysis tools are applied. Here, we compared different software packages to interrogate immunopeptidomics datasets and found that PEAKS reproducibly reports substantially more peptide sequences (~30-70%) compared with MaxQuant, Comet, and MS-GF+ at a global false discovery rate (FDR) of <1%. We noted that these di...

Similar Datasets

2018-09-03 | PXD010793 | Pride
2020-07-28 | PXD014337 | Pride
2022-08-12 | PXD025346 | Pride
2018-06-18 | PXD010152 | Pride
2022-04-06 | PXD031709 | Pride
2022-02-17 | PXD023684 | Pride
2021-04-14 | PXD021755 | Pride
2023-01-23 | PXD038862 | Pride
2018-04-10 | PXD008500 | Pride
2019-12-16 | GSE131267 | GEO